Paul Tremberth
f2ac24eb7b
Do not only warn on wrong spider modules for "scrapy list"
2017-03-09 17:38:15 +01:00
Paul Tremberth
36160f1b0a
Merge pull request #2477 from scrapy/recommend-anaconda-for-win
...
[MRG+2] docs: installation instructions, mention conda in the beginning (closes #2475)
2017-03-09 12:56:50 +01:00
Paul Tremberth
c0cbaccb7b
Merge pull request #2581 from lopuhin/respect-custom-log-level
...
[MRG+1] Respect custom log level: fixes GH-1612
2017-03-09 12:47:20 +01:00
Paul Tremberth
db13ab5ec2
Merge pull request #2611 from delftswa2017/httpcachemiddleware_log
...
[MRG+1] [logging] HttpCacheMiddleware: log cache directory at instantiation
2017-03-09 12:44:10 +01:00
Paul Tremberth
c5e193d079
Merge pull request #2636 from delftswa2017/contrib_docs_fix
...
[MRG+1] Removed contrib section in contribution documentation
2017-03-09 12:42:59 +01:00
Paul Tremberth
9cfe9ae098
Do not use self.assertRaises() as context manager
2017-03-09 12:21:03 +01:00
jorenham
ac63d3a3cf
Removed contrib section; contrib is deprecated
2017-03-09 10:56:23 +01:00
Paul Tremberth
7be773e14a
Add SPIDER_LOADER_WARN_ONLY to toggle between spiderloader failure and warning
2017-03-07 17:40:40 +01:00
Mikhail Korobov
d3f8f3d38a
Merge pull request #2612 from redapple/dupe-spider-name-tests
...
[WIP] Add warning on duplicate spider name
2017-03-07 20:08:42 +05:00
Paul Tremberth
0b9a18e1a1
Warn only once for all spiders
2017-03-07 15:41:17 +01:00
Paul Tremberth
978306a223
Fix dupe spider name warning string tests
2017-03-07 14:48:16 +01:00
Paul Tremberth
a017f7b932
Warn about modules where duplicate spider names were found
2017-03-07 14:44:27 +01:00
Mikhail Korobov
802ed30e8e
Merge pull request #2391 from redapple/always-gunzip-content-enc-gzip
...
[MRG] Always decompress Content-Encoding: gzip at HttpCompression stage
2017-03-07 16:56:54 +05:00
Paul Tremberth
b6378c7ef6
Revert to using self.assert methods
2017-03-07 12:28:24 +01:00
Konstantin Lopuhin
6f55ca4643
Revert unneeded test_crawl changes
2017-03-07 14:20:52 +03:00
Paul Tremberth
4caceccd59
Use 3-bytes for gzip archive type sniffing
2017-03-07 11:02:46 +01:00
Paul Tremberth
b174744b80
Do not silently fail on gzip unzipping
2017-03-07 11:02:46 +01:00
Paul Tremberth
11cdf58abe
Always decompress Content-Encoding: gzip at HttpCompression stage
...
Let SitemapSpider handle decoding of .xml.gz files if necessary
2017-03-07 11:02:46 +01:00
Mikhail Korobov
6320b743cf
Merge pull request #2393 from redapple/response-replace-encoding
...
[MRG] Use body to choose response type after decompression content
2017-03-07 13:23:35 +05:00
Paul Tremberth
e42b846a9f
Use body to choose response type after decompression content
2017-03-06 23:10:38 +01:00
Mikhail Korobov
451f147468
Merge pull request #2628 from redapple/brotli-docs
...
DOC Mention brotli support in HttpCompressionMiddleware section
2017-03-06 22:16:21 +05:00
Mikhail Korobov
20c4f9f43a
Merge pull request #2627 from redapple/referrer-empty-string-header
...
Fix referrer policy from response headers and support explicit empty string
2017-03-06 21:44:08 +05:00
Paul Tremberth
2a7d391e0b
DOC Mention brotli support in HttpCompressionMiddleware section
2017-03-06 17:30:32 +01:00
Paul Tremberth
768f3155e5
Fix referrer policy from response headers and support explicit empty string
2017-03-06 16:20:37 +01:00
Paul Tremberth
d7b26edf6b
Merge pull request #2625 from redapple/release-notes-packaging-fix
...
Update release notes for 1.0.7, 1.1.4 and 1.2.3
2017-03-06 14:47:01 +01:00
Paul Tremberth
a8b47d6c68
Update release notes for 1.0.7, 1.1.4 and 1.2.3
2017-03-06 14:25:52 +01:00
Paul Tremberth
0a17f9b55c
Merge pull request #2334 from ArturGaspar/data_uri
...
[MRG+1] data URI download handler.
2017-03-06 13:50:46 +01:00
Mikhail Korobov
2c5fa0c130
Merge pull request #2617 from redapple/engine-dupe-statement
...
Remove redundant slot.add_request() call in ExecutionEngine
2017-03-03 21:44:07 +05:00
Paul Tremberth
30d812eea2
Remove redundant slot.add_request() call in ExecutionEngine
2017-03-03 17:15:37 +01:00
Paul Tremberth
c68f99eed8
Refactor settings tests
2017-03-03 17:03:25 +01:00
Paul Tremberth
c3b6feca0e
Fix setting lookup for MEDIA_ALLOWED_REDIRECTS
2017-03-03 16:29:07 +01:00
Konstantin Lopuhin
ef04cfd237
Respect log settings in custom_settings: fixes GH-1612
...
A new root logger is installed when a crawler is created
if one was already installed before.
This allows custom settings related to logging,
such as LOG_LEVEL, LOG_FILE, etc., to be respected.
2017-03-03 18:01:11 +03:00
Paul Tremberth
f7e11b198e
Cleanup
2017-03-03 16:00:59 +01:00
Paul Tremberth
ecde166ee1
Refactor without MEDIA_HTTPSTATUS_LIST setting
2017-03-03 15:52:05 +01:00
Paul Tremberth
72fbb687d7
Revert "expose allowed_status tuple for media pipeline"
...
This reverts commit 052809c73ed20b9a728a8fd7df3de5f45f2dad8d.
2017-03-03 15:52:05 +01:00
Paul Tremberth
c64ebee062
Refactor (WIP)
2017-03-03 15:52:05 +01:00
Bernardas
11b31c9fbd
fix redirect change
2017-03-03 15:52:05 +01:00
Bernardas
3cef1cd451
adjust variable wording and redirect logic
2017-03-03 15:52:05 +01:00
Bernardas
f0b4077f81
expose allowed_status tuple for media pipeline
2017-03-03 15:52:05 +01:00
Bernardas
2e052c8615
fix error when settings are not provided
2017-03-03 15:52:05 +01:00
Bernardas
8542780854
typo and clarify handling
2017-03-03 15:52:05 +01:00
Bernardas
6a42214716
add tests for media pipeline MEDIA_ALLOW_REDIRECTS and MEDIA_HTTPSTATUS_LIST settings
2017-03-03 15:52:05 +01:00
Bernardas
25ed491219
add description for media pipeline MEDIA_ALLOW_REDIRECTS and MEDIA_HTTPSTATUS_LIST settings
2017-03-03 15:52:05 +01:00
Bernardas
5d0058492c
add media pipeline settings to enable redirection and handling of certain http statuses
2017-03-03 15:52:05 +01:00
jorenham
5e89db5484
Moved cache dir logging to open_spider
in FilesystemCacheStorage for consistency
2017-03-03 15:32:20 +01:00
jorenham
42b429dc37
Log full cache file path instead of cache directory for the storages that cache to single files.
2017-03-03 15:15:59 +01:00
Mikhail Korobov
7e8453cf1e
Merge pull request #2306 from redapple/referrer-policy
...
[MRG] Referrer policies in RefererMiddleware
2017-03-03 04:05:13 +05:00
Paul Tremberth
fad499ab60
Rename arguments (bis)
2017-03-02 23:06:04 +01:00
Paul Tremberth
db176f872b
Remove Legacy Policy which is equivalent to UnsafeUrl Policy
2017-03-02 22:56:23 +01:00
Mikhail Korobov
93024c242b
Merge pull request #2537 from scrapy/no-canonicalize
...
[MRG+1] Set canonicalize=False for LinkExtractor
2017-03-03 02:53:36 +05:00