Paul Tremberth
07f9985a94
TST: Randomize FILES_EXPIRES above 90 days
2016-12-21 17:03:11 +01:00
Elias Dorneles
d09ec3db68
Merge pull request #2410 from redapple/fetch-transparent-redirect
...
[MRG+1] Transparently handle redirections in fetch and shell
2016-12-21 09:49:15 -02:00
Mikhail Korobov
d19c4c1f80
Merge pull request #2433 from redapple/wrong-spidermodules-warning
...
[MRG+1] Warn user instead of failing for wrong SPIDER_MODULES setting
2016-12-19 21:46:56 +05:00
Paul Tremberth
ed1e4d8df3
Merge pull request #1731 from scrapy/disable-toplevel-2
...
[MRG+1] LOG_SHORT_NAMES option to disable TopLevelFormatter
2016-12-19 16:02:55 +01:00
Elias Dorneles
97e82107b1
Merge pull request #2270 from redapple/httperror-log-info
...
[MRG+1] Raise log level for HttpErrorMiddleware to INFO (from DEBUG)
2016-12-14 12:18:04 -02:00
Paul Tremberth
f7e4081414
Add tests for SequenceExclude container
2016-12-12 22:37:53 +01:00
Mikhail Korobov
05b4555f39
TST tests for LOG_SHORT_NAMES
2016-12-09 02:19:51 +05:00
Mikhail Korobov
e46572d6f2
TST end-to-end test for LOG_LEVEL option
...
there were no end-to-end tests for this option
2016-12-09 02:19:33 +05:00
Mikhail Korobov
6eab59cbac
TST cleanup runspider tests
2016-12-09 02:14:12 +05:00
Paul Tremberth
948e3cd003
Warn user instead of failing for wrong SPIDER_MODULES setting
2016-12-08 12:50:26 +01:00
Paul Tremberth
2cd579a774
Add test for fetch(url) within shell with and without redirect
2016-12-07 19:07:32 +01:00
Paul Tremberth
7e54de2455
Add tests for shell command with and without --no-redirect
2016-12-07 18:41:24 +01:00
Elias Dorneles
cce631abec
Merge pull request #1887 from nyov/twisted11
...
[MRG+2] Bump Twisted dependency to 13.1.0
2016-12-07 15:00:01 -02:00
Paul Tremberth
778bed07bf
Let framework handle only HTTP redirects by default for fetch and shell commands
2016-12-07 17:56:13 +01:00
Mikhail Korobov
ff3aec6613
Merge pull request #2331 from moisesguimaraes/fixes-2272
...
[MRG+1] Fixes issue #2272 using arg_to_iter() to wrap single values and list() to…
2016-12-07 20:08:18 +05:00
Elias Dorneles
a9c69458ff
Merge pull request #2422 from rolando-contrib/nested-spiders-modules
...
[MRG+1] DOC State explicitly that spiders are loaded recursively.
2016-12-07 11:21:15 -02:00
Paul Tremberth
5efd65255c
TST: Randomize IMAGES_EXPIRES above 90 days
2016-12-06 18:49:53 +01:00
nyov
534772f6ea
Import xlib.tx code from twisted proper
2016-12-02 21:21:51 +00:00
Elias Dorneles
c4e67c0696
Merge pull request #2421 from rolando-contrib/tests-bug
...
[MRG+1] TST: Fix duplicated test name.
2016-12-01 18:43:48 -02:00
Elias Dorneles
f3a4420750
Merge pull request #2388 from redapple/robotparser-native-str
...
[MRG+1] Parse robots.txt content as native str
2016-12-01 18:43:23 -02:00
Rolando Espinoza
923b974f0a
TST Include a nested spider in spider loader test.
2016-12-01 13:26:19 -03:00
Rolando Espinoza
d9f43e21ba
TST: Fix duplicated test name.
2016-12-01 11:56:33 -03:00
Eugenio Lacuesta
5ff64ad015
handle relative sitemap urls in robots.txt
2016-12-01 09:53:40 -03:00
Paul Tremberth
9aefc0a886
Add test for fetch command with redirections disabled
2016-11-24 13:41:51 +01:00
Paul Tremberth
01142e2ae5
Print more dependencies versions in "scrapy version" verbose output
2016-11-22 14:48:33 +01:00
Paul Tremberth
6cd35c77da
Pass user-agent as native str when checking URLs against robots.txt
2016-11-15 17:38:32 +01:00
Paul Tremberth
de89b1b562
Merge pull request #2275 from scrapy/response-css-xpath-message
...
[MRG+1] Add better messages for when response content isn't text (closes #2264)
2016-11-10 11:38:22 +01:00
Paul Tremberth
28155dfccc
Parse robots.txt content as native str
...
Fixes #2373
2016-11-09 12:20:06 +01:00
Paul Tremberth
af2280e695
Update docstring
2016-11-08 13:30:51 +01:00
Paul Tremberth
27456996a9
Add assertion on crawler not running
2016-11-08 11:46:16 +01:00
Paul Tremberth
61efacdd1f
Add testcase for catching exception from open_spider() from pipeline
2016-11-08 11:35:42 +01:00
Paul Tremberth
db40852892
Do not interpret non-ASCII bytes in "Location" and percent-encode them (#2322)
...
* Do not interpret non-ASCII bytes in "Location" and percent-encode them
Fixes GH-2321
The idea is to not guess the encoding of "Location" header value
and simply percent-encode non-ASCII bytes,
which should then be re-interpreted correctly by the remote website
in whatever encoding was used originally.
See https://tools.ietf.org/html/rfc3987#section-3.2
This is similar to the changes to safe_url_string in
https://github.com/scrapy/w3lib/pull/45
* Remove unused import
2016-10-19 23:26:12 -03:00
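The approach described in the commit above (percent-encode non-ASCII bytes in "Location" without guessing an encoding) can be illustrated with a minimal sketch. The helper name percent_encode_location and the SAFE_CHARS set are assumptions made for this illustration; this is not the code from the Scrapy or w3lib change.

```python
from urllib.parse import quote_from_bytes

# Characters left untouched: RFC 3986 reserved characters plus "%", so that
# escapes already present in the header value are not encoded a second time.
# This set is an assumption for the sketch, not necessarily what Scrapy uses.
SAFE_CHARS = "/:?#[]@!$&'()*+,;=%"


def percent_encode_location(raw_location: bytes) -> str:
    """Hypothetical helper: percent-encode the raw header bytes as they are.

    The bytes are never decoded, so no encoding has to be guessed. Each
    non-ASCII byte becomes %XX, and the remote server can interpret the
    result in whatever encoding it used originally.
    """
    return quote_from_bytes(raw_location, safe=SAFE_CHARS)


# The same path sent as Latin-1 bytes and as UTF-8 bytes.
latin1 = percent_encode_location(b"/caf\xe9?x=1")
utf8 = percent_encode_location(b"/caf\xc3\xa9?x=1")
print(latin1)
print(utf8)
```

Because the bytes are escaped one by one, the Latin-1 and UTF-8 forms of the same path produce different but equally valid ASCII URLs, and neither requires knowing which encoding the server used.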
Moisés Guimarães
45e95b79ce
(fixes #2272) using arg_to_iter() to wrap single values and list() to avoid consuming from generators.
2016-10-18 11:06:55 -03:00
Elias Dorneles
2d932c173c
test abs path outside project as well
2016-09-30 15:07:58 -03:00
Elias Dorneles
25bd3b3fea
add .scrapy when outside spider too, add tests
2016-09-29 18:30:42 -03:00
Elias Dorneles
9c9690c76c
add better messages for when response content isn't text (closes #2264)
2016-09-21 10:30:35 -03:00
Paul Tremberth
81a0e3cd93
Raise log level for HttpErrorMiddleware to INFO (from DEBUG)
...
Fixes GH-910
2016-09-20 13:44:21 +02:00
Paul Tremberth
41cd9f401f
Merge pull request #2243 from pawelmhm/image-pipeline-2198
...
[MRG+1] [image & file pipeline] loading setting for user classes
2016-09-19 18:43:52 +02:00
Mikhail Korobov
5657f6b8ef
Merge pull request #2258 from redapple/feed-export-started
...
[MRG+1] Feed exporter: start exporting only on first item
2016-09-19 14:40:30 +06:00
Mikhail Korobov
552368727a
Merge pull request #2225 from Tethik/parse_command_rules_fix
...
[MRG+1] Two small fixes for when using the parse command and the '-r' flag (rules).
2016-09-19 14:39:09 +06:00
Joakim Uddholm
8c38dde4e8
Moved parse command tests to its own file. Added some checks to check for logged errors.
2016-09-19 05:33:05 +02:00
Paul Tremberth
03ab077249
Feed exporter: start exporting only on first item
...
Fixes GH-872
2016-09-17 01:36:56 +02:00
Paul Tremberth
b828facff4
Add shell test for using scrapy.Request() directly without importing scrapy
2016-09-15 19:25:20 +02:00
pawelmhm
7d88209543
[image & file pipeline] loading setting for user classes
...
If a user has a custom subclass of the Image pipeline and no settings for
that subclass, they should get the default settings defined for the Image Pipeline.
Fixes #2198
2016-09-15 09:39:16 +02:00
Elias Dorneles
129421c7e3
Merge pull request #1503 from demelziraptor/amazon-json-response
...
[MRG+1] interpreting json-amazonui-streaming as TextResponse
2016-09-12 13:21:16 -03:00
Paul Tremberth
fbb5559299
Add tests for crawl command non-default cases
2016-09-12 13:35:14 +02:00
Paul Tremberth
9de6f1ca75
Merge pull request #1905 from rootAvish/duplication-fix
...
[MRG+1] Modified read failure recovery in utils/gz.py to read only the last f.extrasize bytes of f.extrabuf[ ]
2016-08-17 14:51:30 +02:00
Ashish Kulkarni
bb3b806467
Use w3lib.url.canonicalize_url() from w3lib 1.15.0
...
Also remove code/imports which are now unused due to this change.
fixes #2157
2016-08-16 17:42:16 +05:30
Paul Tremberth
9a734e6759
Merge pull request #2058 from dalleng/serialize_set
...
[MRG+1] Add set serialization to ScrapyJSONEncoder
2016-08-12 18:28:34 +02:00
rootavish
d9437fd3d9
Modifying existing gzip read failure recovery mechanism to patch read for broken archives
2016-08-11 18:21:42 +05:30