mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 22:04:16 +00:00

967 Commits

Author SHA1 Message Date
Paul Tremberth
ecde166ee1 Refactor without MEDIA_HTTPSTATUS_LIST setting 2017-03-03 15:52:05 +01:00
Bernardas
8542780854 typo and clarify handling 2017-03-03 15:52:05 +01:00
Bernardas
25ed491219 add description for media pipeline MEDIA_ALLOW_REDIRECTS and MEDIA_HTTPSTATUS_LIST settings 2017-03-03 15:52:05 +01:00
Mikhail Korobov
7e8453cf1e Merge pull request #2306 from redapple/referrer-policy
[MRG] Referrer policies in RefererMiddleware
2017-03-03 04:05:13 +05:00
Mikhail Korobov
93024c242b Merge pull request #2537 from scrapy/no-canonicalize
[MRG+1] Set canonicalize=False for LinkExtractor
2017-03-03 02:53:36 +05:00
Paul Tremberth
bc200d1155 Rename setting to REFERRER_POLICY (with 2 Rs) 2017-03-01 17:51:23 +01:00
Paul Tremberth
537683f945 Add autoclass directives to document built-in policies 2017-03-01 17:51:23 +01:00
Paul Tremberth
3dc09eeceb Use table for referrer policy options 2017-03-01 17:51:23 +01:00
Paul Tremberth
605935f015 Edit text 2017-03-01 17:51:23 +01:00
Paul Tremberth
eb07285a63 Reword warning on no-referrer-when-downgrade policy 2017-03-01 17:51:23 +01:00
Paul Tremberth
03ff19d188 Update docs for new "referrer_policy" Request.meta key 2017-03-01 17:51:23 +01:00
Paul Tremberth
e249abc32b Update docs 2017-03-01 17:50:39 +01:00
Paul Tremberth
c86f568b9c Update docs with "strict-..." policies 2017-03-01 17:50:39 +01:00
Paul Tremberth
c9c59db489 Update documentation about REFERER_POLICY setting 2017-03-01 17:50:39 +01:00
Mikhail Korobov
7b49b9c0f5 Merge pull request #2590 from rolando-contrib/handle-data-loss-gracefully
[MRG+2] Handle data loss gracefully.
2017-03-01 20:23:19 +05:00
Rolando Espinoza
f01ae6ffcd Handle data loss gracefully.
Websites that return a wrong ``Content-Length`` header may cause a data
loss error. Also when a chunked response is not finished properly.

This change adds a new setting ``DOWNLOAD_FAIL_ON_DATALOSS`` (default:
``True``) and request.meta key ``download_fail_on_dataloss``.
2017-03-01 11:43:53 -03:00
MikeinRealLife
441f25507e fixed typo
removed duplicate line
2017-02-22 21:23:27 -08:00
MikeinRealLife
96a570a93a fixed ticket #2574 2017-02-22 21:17:34 -08:00
Mikhail Korobov
47f7da8724 canonicalize=False by default for LinkExtractor. Fixes GH-1941. 2017-02-20 22:58:11 +05:00
Mikhail Korobov
93e449f1f6 Merge pull request #2343 from redapple/anonymous-ftp
[MRG+1] Support Anonymous FTP
2017-02-20 23:19:54 +06:00
Paul Tremberth
f3a7567443 Add note on FTP_PASSWORD default value 2017-02-20 17:15:05 +01:00
Omer Schleifer
ff3e299eb0 [MRG+2] add flags to request (#2082)
* add flags to request

* fix test - add flags to request

* fix test(2) - add flags to request

* fix test(2) - add flags to request

* Updated test to reqser with flags field of request

* Updated documentation with flags field of request

* fix test indentation

* fix test failed

* make the change backward compatible

* remove unrequired spaces, fix documentation request flags

* remove unrequired space

* fix assert equal

* flags default is empty list

* add flags to request squashed commits
2017-02-20 20:42:29 +06:00
Daniel Graña
b0388e49b4 Merge pull request #1728 from scrapy/deprecate-make-requests-from-url
deprecate Spider.make_requests_from_url.
2017-02-20 11:23:49 -03:00
Daniel Graña
c68140e68a Merge pull request #2540 from scrapy/response-follow
response.follow
2017-02-20 11:21:21 -03:00
Daniel Graña
e578be73b6 Merge pull request #2539 from scrapy/enable-memusage
Enable memusage extension by default.
2017-02-20 11:18:04 -03:00
Daniel Graña
fab168bbfb Merge pull request #2572 from advarisk/warning-formrequest-from-response
[MRG+1] document issue with FormRequest.from_response due to bug in lxml
2017-02-20 11:14:28 -03:00
Daniel Graña
85ef6a6229 Merge pull request #2564 from elacuesta/docs_exporters
[MRG+1] Doc: binary mode is required for exporters
2017-02-20 11:12:48 -03:00
Paul Tremberth
d35a01a103 Update default password 2017-02-20 14:23:23 +01:00
Paul Tremberth
b80e1bb6c5 Document new FTP_* settings 2017-02-20 14:19:36 +01:00
Ashish Kulkarni
165e2cb8c9 document issue with FormRequest.from_response due to bug in lxml
This can make the spider fail due to incorrect values being posted
server-side, which is extremely hard to debug because it is easy
to miss leading/trailing whitespace, even with a logging proxy.

The fix was merged for lxml 3.8 in lxml/lxml#228 so document that
as well.
2017-02-17 14:54:22 +05:30
Mikhail Korobov
692975acb4 deprecate Spider.make_requests_from_url. Fixes #1495. 2017-02-16 03:39:34 +05:00
Mikhail Korobov
d09eed7674 use w3lib.html.strip_html5_whitespace function; expand docs; strip consistently before calling process_value 2017-02-16 02:22:18 +05:00
Mikhail Korobov
d079e15fe2 Strip leading/trailing whitespaces in link extractors. Fixes GH-838. 2017-02-16 02:22:17 +05:00
Mikhail Korobov
5b79c6a679 DOC document response.follow methods; expand the tutorial 2017-02-16 00:06:52 +05:00
Mikhail Korobov
877057fac0 initial response.follow implementation 2017-02-15 01:22:53 +05:00
Eugenio Lacuesta
922d3fec54 Doc: binary mode is required for exporters 2017-02-14 12:51:03 -03:00
Eugenio Lacuesta
ae0ea31abd Add HTTPPROXY_ENABLED setting (default True) 2017-02-14 11:33:01 -03:00
Eugenio Lacuesta
9c0aae724e Use credentials from request.meta['proxy'] if present 2017-02-08 13:22:30 -03:00
Paul Tremberth
d205206aaa Merge pull request #2345 from gustavodeandrade/master
[MRG+1] Fix documentation about HTML entities decoding with selector extraction
2017-02-08 13:21:15 +01:00
Paul Tremberth
7d0b89042f Merge pull request #2533 from djrobust/patch-1
[MRG+1] Use yield with nested parsing of responses
2017-02-08 13:02:50 +01:00
Mikhail Korobov
85a124970a Enable memusage extension by default. Fixes GH-2187. 2017-02-07 03:32:54 +05:00
Mikhail Korobov
3b8e6d4d82 Merge pull request #2531 from Lukas0907/patch-1
Fix typo in downloader-middleware.rst.
2017-02-06 13:09:35 +05:00
Takehiro Shiozaki
fcb3daf4fa fix typo 2017-02-06 14:03:41 +09:00
djrobust
3021084f37 Use 'yield' when parsing multiple responses
Use 'yield' consistently across examples of parse functions.
2017-02-04 20:07:05 -08:00
Lukas Anzinger
09643796b4 Fix typo in downloader-middleware.rst. 2017-02-03 20:05:17 +01:00
Paul Tremberth
1c0b805357 DOC Mention XPath variables in Selectors section 2017-02-02 17:44:57 +01:00
Mikhail Korobov
c305c46254 Merge pull request #2503 from redapple/view-no-redirect
[MRG] Fix view command against new --no-redirect option
2017-02-02 21:24:56 +05:00
Mikhail Korobov
55742c0392 DOC mention LevelDB cache storage backend 2017-02-01 22:43:28 +05:00
Paul Tremberth
4156a86148 Update docs on view command 2017-01-30 15:57:37 +01:00
Raul Gallegos
df1a42419f adding formid to FormRequest documentation 2017-01-14 20:45:20 -05:00