mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 00:23:52 +00:00

6326 Commits

Author SHA1 Message Date
Paul Tremberth
f2ac24eb7b Do not only warn on wrong spider modules for "scrapy list" 2017-03-09 17:38:15 +01:00
Paul Tremberth
36160f1b0a Merge pull request #2477 from scrapy/recommend-anaconda-for-win
[MRG+2] docs: installation instructions, mention conda in the beginning (closes #2475)
2017-03-09 12:56:50 +01:00
Paul Tremberth
c0cbaccb7b Merge pull request #2581 from lopuhin/respect-custom-log-level
[MRG+1] Respect custom log level: fixes GH-1612
2017-03-09 12:47:20 +01:00
Paul Tremberth
db13ab5ec2 Merge pull request #2611 from delftswa2017/httpcachemiddleware_log
[MRG+1] [logging] HttpCacheMiddleware: log cache directory at instantiation
2017-03-09 12:44:10 +01:00
Paul Tremberth
c5e193d079 Merge pull request #2636 from delftswa2017/contrib_docs_fix
[MRG+1] Removed contrib section in contribution documentation
2017-03-09 12:42:59 +01:00
Paul Tremberth
9cfe9ae098 Do not use self.assertRaises() as context manager 2017-03-09 12:21:03 +01:00
jorenham
ac63d3a3cf Removed contrib section; contrib is deprecated 2017-03-09 10:56:23 +01:00
Paul Tremberth
7be773e14a Add SPIDER_LOADER_WARN_ONLY to toggle between spiderloader failure and warning 2017-03-07 17:40:40 +01:00
Mikhail Korobov
d3f8f3d38a Merge pull request #2612 from redapple/dupe-spider-name-tests
[WIP] Add warning on duplicate spider name
2017-03-07 20:08:42 +05:00
Paul Tremberth
0b9a18e1a1 Warn only once for all spiders 2017-03-07 15:41:17 +01:00
Paul Tremberth
978306a223 Fix dupe spider name warning string tests 2017-03-07 14:48:16 +01:00
Paul Tremberth
a017f7b932 Warn about modules where duplicate spider names were found 2017-03-07 14:44:27 +01:00
Mikhail Korobov
802ed30e8e Merge pull request #2391 from redapple/always-gunzip-content-enc-gzip
[MRG] Always decompress Content-Encoding: gzip at HttpCompression stage
2017-03-07 16:56:54 +05:00
Paul Tremberth
b6378c7ef6 Revert to using self.assert methods 2017-03-07 12:28:24 +01:00
Konstantin Lopuhin
6f55ca4643 Revert unneeded test_crawl changes 2017-03-07 14:20:52 +03:00
Paul Tremberth
4caceccd59 Use 3-bytes for gzip archive type sniffing 2017-03-07 11:02:46 +01:00
Paul Tremberth
b174744b80 Do not silently fail on gzip unzipping 2017-03-07 11:02:46 +01:00
Paul Tremberth
11cdf58abe Always decompress Content-Encoding: gzip at HttpCompression stage
Let SitemapSpider handle decoding of .xml.gz files if necessary
2017-03-07 11:02:46 +01:00
Mikhail Korobov
6320b743cf Merge pull request #2393 from redapple/response-replace-encoding
[MRG] Use body to choose response type after decompression content
2017-03-07 13:23:35 +05:00
Paul Tremberth
e42b846a9f Use body to chose response type after decompression content 2017-03-06 23:10:38 +01:00
Mikhail Korobov
451f147468 Merge pull request #2628 from redapple/brotli-docs
DOC Mention brotli support in HttpCompressionMiddleware section
2017-03-06 22:16:21 +05:00
Mikhail Korobov
20c4f9f43a Merge pull request #2627 from redapple/referrer-empty-string-header
Fix referrer policy from response headers and support explicit empty string
2017-03-06 21:44:08 +05:00
Paul Tremberth
2a7d391e0b DOC Mention brotli support in HttpCompressionMiddleware section 2017-03-06 17:30:32 +01:00
Paul Tremberth
768f3155e5 Fix referrer policy from response headers and support explicit empty string 2017-03-06 16:20:37 +01:00
Paul Tremberth
d7b26edf6b Merge pull request #2625 from redapple/release-notes-packaging-fix
Update release notes for 1.0.7, 1.1.4 and 1.2.3
2017-03-06 14:47:01 +01:00
Paul Tremberth
a8b47d6c68 Update release notes for 1.0.7, 1.1.4 and 1.2.3 2017-03-06 14:25:52 +01:00
Paul Tremberth
0a17f9b55c Merge pull request #2334 from ArturGaspar/data_uri
[MRG+1] data URI download handler.
2017-03-06 13:50:46 +01:00
Mikhail Korobov
2c5fa0c130 Merge pull request #2617 from redapple/engine-dupe-statement
Remove redundant slot.add_request() call in ExecutionEngine
2017-03-03 21:44:07 +05:00
Paul Tremberth
30d812eea2 Remove redundant slot.add_request() call in ExecutionEngine 2017-03-03 17:15:37 +01:00
Paul Tremberth
c68f99eed8 Refactor settings tests 2017-03-03 17:03:25 +01:00
Paul Tremberth
c3b6feca0e Fix setting lookup for MEDIA_ALLOWED_REDIRECTS 2017-03-03 16:29:07 +01:00
Konstantin Lopuhin
ef04cfd237 Respect log settings in custom_settings: fixes GH-1612
A new root logger is installed when a crawler is created
if one was already installed before.
This allows to respect custom settings related to logging,
such as LOG_LEVEL, LOG_FILE, etc.
2017-03-03 18:01:11 +03:00
Paul Tremberth
f7e11b198e Cleanup 2017-03-03 16:00:59 +01:00
Paul Tremberth
ecde166ee1 Refactor without MEDIA_HTTPSTATUS_LIST setting 2017-03-03 15:52:05 +01:00
Paul Tremberth
72fbb687d7 Revert "expose allowed_status tuple for media pipeline"
This reverts commit 052809c73ed20b9a728a8fd7df3de5f45f2dad8d.
2017-03-03 15:52:05 +01:00
Paul Tremberth
c64ebee062 Refactor (WIP) 2017-03-03 15:52:05 +01:00
Bernardas
11b31c9fbd fix redirect change 2017-03-03 15:52:05 +01:00
Bernardas
3cef1cd451 adjust variable wording and redirect logic 2017-03-03 15:52:05 +01:00
Bernardas
f0b4077f81 expose allowed_status tuple for media pipeline 2017-03-03 15:52:05 +01:00
Bernardas
2e052c8615 fix error when settings are not provided 2017-03-03 15:52:05 +01:00
Bernardas
8542780854 typo and clarify handling 2017-03-03 15:52:05 +01:00
Bernardas
6a42214716 add tests for media pipeline MEDIA_ALLOW_REDIRECTS and MEDIA_HTTPSTATUS_LIST settings 2017-03-03 15:52:05 +01:00
Bernardas
25ed491219 add description for media pipeline MEDIA_ALLOW_REDIRECTS and MEDIA_HTTPSTATUS_LIST settings 2017-03-03 15:52:05 +01:00
Bernardas
5d0058492c add media pipeline settings to enable redirection and handling of certain http statuses 2017-03-03 15:52:05 +01:00
jorenham
5e89db5484 Moved cache dir logging to open_spider in FilesystemCacheStorage for consistency 2017-03-03 15:32:20 +01:00
jorenham
42b429dc37 Log full cache file path instead of cache directory for the storages that cache to single files. 2017-03-03 15:15:59 +01:00
Mikhail Korobov
7e8453cf1e Merge pull request #2306 from redapple/referrer-policy
[MRG] Referrer policies in RefererMiddleware
2017-03-03 04:05:13 +05:00
Paul Tremberth
fad499ab60 Rename arguments (bis) 2017-03-02 23:06:04 +01:00
Paul Tremberth
db176f872b Remove Legacy Policy which is equivalent to UnsafeUrl Policy 2017-03-02 22:56:23 +01:00
Mikhail Korobov
93024c242b Merge pull request #2537 from scrapy/no-canonicalize
[MRG+1] Set canonicalize=False for LinkExtractor
2017-03-03 02:53:36 +05:00