mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 07:24:09 +00:00

724 Commits

Author SHA1 Message Date
woxcab
fbb411a805 Allowed passing objects of Mapping class or its subclass to the CaselessDict initializer 2017-03-13 14:16:39 +03:00
Paul Tremberth
871134ee22 Refactor to also test FilesPipeline 2017-03-12 17:30:24 +01:00
Paul Tremberth
708f1b009b Add integration tests for MEDIA_ALLOW_REDIRECTS 2017-03-10 21:36:33 +01:00
Paul Tremberth
7dcc86e61a Add file listing resource + redirecting resource to MockServer 2017-03-10 21:35:25 +01:00
Mikhail Korobov
9c69e90056 Merge pull request #2632 from redapple/spider-loader-warn-or-fail
[MRG] Add SPIDER_LOADER_WARN_ONLY to toggle between spiderloader failure or warning
2017-03-09 23:01:27 +05:00
Paul Tremberth
c0cbaccb7b Merge pull request #2581 from lopuhin/respect-custom-log-level
[MRG+1] Respect custom log level: fixes GH-1612
2017-03-09 12:47:20 +01:00
Paul Tremberth
9cfe9ae098 Do not use self.assertRaises() as context manager 2017-03-09 12:21:03 +01:00
Paul Tremberth
7be773e14a Add SPIDER_LOADER_WARN_ONLY to toggle between spiderloader failure and warning 2017-03-07 17:40:40 +01:00
Mikhail Korobov
d3f8f3d38a Merge pull request #2612 from redapple/dupe-spider-name-tests
[WIP] Add warning on duplicate spider name
2017-03-07 20:08:42 +05:00
Eugenio Lacuesta
c7bb2fa8ce Feed exports: consistent and backwards compatible behaviour on indent 2017-03-07 11:56:00 -03:00
Eugenio Lacuesta
766b2c8453 Feed exports: enforce difference between None and 0 on indent
Also rename params and settings from "indent_width" to just "indent"
2017-03-07 11:56:00 -03:00
Eugenio Lacuesta
7e9153b38d Feed exports: beautify JSON and XML 2017-03-07 11:56:00 -03:00
Paul Tremberth
0b9a18e1a1 Warn only once for all spiders 2017-03-07 15:41:17 +01:00
Paul Tremberth
978306a223 Fix dupe spider name warning string tests 2017-03-07 14:48:16 +01:00
Paul Tremberth
b6378c7ef6 Revert to using self.assert methods 2017-03-07 12:28:24 +01:00
Konstantin Lopuhin
6f55ca4643 Revert unneeded test_crawl changes 2017-03-07 14:20:52 +03:00
Paul Tremberth
11cdf58abe Always decompress Content-Encoding: gzip at HttpCompression stage
Let SitemapSpider handle decoding of .xml.gz files if necessary
2017-03-07 11:02:46 +01:00
Paul Tremberth
e42b846a9f Use body to choose response type after decompressing content 2017-03-06 23:10:38 +01:00
Paul Tremberth
768f3155e5 Fix referrer policy from response headers and support explicit empty string 2017-03-06 16:20:37 +01:00
Paul Tremberth
0a17f9b55c Merge pull request #2334 from ArturGaspar/data_uri
[MRG+1] data URI download handler.
2017-03-06 13:50:46 +01:00
Paul Tremberth
c68f99eed8 Refactor settings tests 2017-03-03 17:03:25 +01:00
Konstantin Lopuhin
ef04cfd237 Respect log settings in custom_settings: fixes GH-1612
A new root logger is installed when a crawler is created
if one was already installed before.
This allows custom settings related to logging,
such as LOG_LEVEL, LOG_FILE, etc., to be respected.
2017-03-03 18:01:11 +03:00
Paul Tremberth
f7e11b198e Cleanup 2017-03-03 16:00:59 +01:00
Paul Tremberth
ecde166ee1 Refactor without MEDIA_HTTPSTATUS_LIST setting 2017-03-03 15:52:05 +01:00
Bernardas
11b31c9fbd fix redirect change 2017-03-03 15:52:05 +01:00
Bernardas
3cef1cd451 adjust variable wording and redirect logic 2017-03-03 15:52:05 +01:00
Bernardas
6a42214716 add tests for media pipeline MEDIA_ALLOW_REDIRECTS and MEDIA_HTTPSTATUS_LIST settings 2017-03-03 15:52:05 +01:00
Mikhail Korobov
7e8453cf1e Merge pull request #2306 from redapple/referrer-policy
[MRG] Referrer policies in RefererMiddleware
2017-03-03 04:05:13 +05:00
Mikhail Korobov
93024c242b Merge pull request #2537 from scrapy/no-canonicalize
[MRG+1] Set canonicalize=False for LinkExtractor
2017-03-03 02:53:36 +05:00
Artur Gaspar
b50d0370f4 Test response attributes in data URI download handler. 2017-03-02 14:46:33 -03:00
Paul Tremberth
12a8ddecab Fix tests 2017-03-02 13:03:18 +01:00
Paul Tremberth
e71803c833 Add tests for duplicate spider name warnings 2017-03-02 12:48:47 +01:00
Paul Tremberth
2d55d838ca Fix strip_url() tests 2017-03-01 20:59:52 +01:00
Paul Tremberth
efa50039ec Add tests for policy fallback on unknown policies from meta and headers 2017-03-01 17:51:23 +01:00
Paul Tremberth
8226e77010 Add test for Referer header on HTTP redirections 2017-03-01 17:51:23 +01:00
Paul Tremberth
d2aa51c0fb Update tests 2017-03-01 17:51:23 +01:00
Paul Tremberth
bc200d1155 Rename setting to REFERRER_POLICY (with 2 Rs) 2017-03-01 17:51:23 +01:00
Paul Tremberth
b6c761d2b4 Fix tests 2017-03-01 17:50:39 +01:00
Paul Tremberth
5cef67ae75 Update Referrer tests for "strict-" policies 2017-03-01 17:50:39 +01:00
Paul Tremberth
0a0b60a59f Add tests for stripping userinfo with percent-encoded delimiters 2017-03-01 17:50:39 +01:00
Paul Tremberth
8864d0e8c1 Rename helper function to strip_url() + add more tests 2017-03-01 17:50:39 +01:00
Paul Tremberth
5dd7311cd4 Move URL credentials stripping to a helper function 2017-03-01 17:50:39 +01:00
Paul Tremberth
d3d4d66ce8 Add tests for referrer-policy set in response HTTP headers 2017-03-01 17:50:39 +01:00
Paul Tremberth
e50e670eff Add test for custom referrer policy via settings 2017-03-01 17:50:39 +01:00
Paul Tremberth
0344f57fef Support case-insensitive policy names in settings 2017-03-01 17:50:39 +01:00
Paul Tremberth
e72b6e3361 Add tests for referrer policy via settings and via Request meta 2017-03-01 17:50:39 +01:00
Paul Tremberth
7ec1b5f6c3 Add tests for the different referrer policies 2017-03-01 17:50:38 +01:00
Paul Tremberth
baed7c436f WIP Add Referrer policies 2017-03-01 17:50:38 +01:00
Rolando Espinoza
f01ae6ffcd Handle data loss gracefully.
Websites that return a wrong ``Content-Length`` header may cause a data
loss error. Also when a chunked response is not finished properly.

This change adds a new setting ``DOWNLOAD_FAIL_ON_DATALOSS`` (default:
``True``) and request.meta key ``download_fail_on_dataloss``.
2017-03-01 11:43:53 -03:00
Mikhail Korobov
0e5ed21397 Merge pull request #2599 from redapple/py3-ignores-cleanup-ftp
Fix FTP downloader and re-enable FTP tests on Python 3
2017-02-28 15:39:42 +05:00