mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 20:04:28 +00:00

479 Commits

Author SHA1 Message Date
Paul Tremberth
0e11b3e6f0 Add idempotence tests for canonicalize_url 2016-04-26 20:03:17 +02:00
Paul Tremberth
68dedf54cb Fix canonicalize_url() on Python 3 and re-enable tests 2016-04-21 14:50:59 +02:00
Mikhail Korobov
73a5571c60 Merge pull request #1923 from scrapy/request-new-safe_url_string
[MRG+1] Use newer w3lib.url.safe_url_string() and re-enable HTTP request tests
2016-04-20 19:33:45 +06:00
Paul Tremberth
cd979ace40 Add HTTPS tests with non-hostname-matching server certificate 2016-04-20 14:42:03 +02:00
Paul Tremberth
d42a98d3b5 Use newer w3lib.url.safe_url_string() and re-enable HTTP request tests 2016-04-12 00:33:25 +02:00
Paul Tremberth
7b5243a263 Add link extractor test for non-ASCII characters in query part of URL 2016-04-09 15:15:01 +02:00
Paul Tremberth
1656fbcffa Fix link extractor tests for non-ASCII characters from latin1 document
URL path component should use UTF-8 before percent-encoding (that's what
browsers do when you open scrapy/tests/sample_data/link_extractor/linkextractor_latin1.html
and follow the links)
This matches current w3lib v1.14.1
2016-04-08 23:25:50 +02:00
Paul Tremberth
0ede017d2a Merge pull request #1891 from djunzu/update_files_images_pipelines
[MRG+1] Change Files/ImagesPipelines class attributes to instance attributes
2016-04-08 12:55:09 +02:00
Paul Tremberth
bf7f675493 Merge pull request #1847 from aron-bordin/add_blocking_storage_path_setting
[MRG+2] added BLOCKING_FEED_STORAGE_PATH to settings
2016-04-01 15:47:06 +02:00
Aron Bordin
9250a5bffa added FEED_TEMPDIR to settings 2016-04-01 00:05:21 -03:00
djunzu
8228a0c491 Change FilesPipeline class attributes to instance attributes.
modified:   scrapy/pipelines/files.py
modified:   tests/test_pipeline_files.py
2016-03-31 19:20:39 -03:00
djunzu
e9d48f8a8e Add tests.
modified:   tests/test_pipeline_files.py
modified:   tests/test_pipeline_images.py
2016-03-31 19:19:49 -03:00
Paul Tremberth
3ba5671fbc Merge pull request #1851 from nyov/binary_or_text
[MRG+1] Rename isbinarytext function to binary_is_text for clarity
2016-03-31 11:55:09 +02:00
nyov
e8ca467572 Rename isbinarytext function to binary_is_text for clarity
Closes #1389
2016-03-30 15:44:15 +00:00
pawelmhm
65c7c05060 response_status_message should not fail on non-standard HTTP codes
The utility is used in the retry middleware, where it failed on non-standard HTTP codes.
Instead of raising an exception when the status passes through to_native_str, it should return
an "Unknown status" message.
2016-03-12 14:16:40 +01:00
Paul Tremberth
ecddc093a4 Explicitly call Twisted transport stopProducing() on HTTP/1.1 timeouts 2016-02-24 23:04:31 +01:00
Daniel Graña
513ba7a1fb Merge pull request #1800 from redapple/http11-post-content-length
[MRG+1] Add "Content-Length: 0" for body-less HTTP/1.1 POST requests
2016-02-23 15:00:33 -03:00
Elias Dorneles
10bcdb49b0 Merge pull request #1787 from scrapy/improve-errors
[MRG+1] Better tracebacks
2016-02-23 13:16:47 -03:00
Paul Tremberth
6174192564 Add "Content-Length: 0" for body-less HTTP/1.1 POST requests
GH-823 was fixed only for HTTP/1.0 (in GH-1089)
2016-02-20 23:27:04 +01:00
Paul Tremberth
da36b7d386 Merge pull request #1761 from lopuhin/py3-s3-botocore
[MRG+1] Py3 S3 botocore
2016-02-18 15:11:03 +01:00
Paul Tremberth
104027d78d Minor change on quotes
Trying to force Travis CI to build
2016-02-18 11:45:03 +01:00
Konstantin Lopuhin
d61fbcc8b5 Support headers in S3FilesStore.persist_file for botocore 2016-02-18 10:57:02 +03:00
Mikhail Korobov
f766dd0ba8 Preserve tracebacks better. Fixes GH-1760. 2016-02-17 23:07:03 +05:00
Mikhail Korobov
06da7af9e2 TST clean up RunSpiderCommandTest 2016-02-17 23:03:12 +05:00
Paul Tremberth
cabed6f183 More liberal Content-Disposition header parsing
Fixes #1782
2016-02-17 16:55:28 +01:00
Konstantin Lopuhin
d1ecb8cd38 Fix S3TestCase for precise env: we reraise TypeError as NotConfigured in this case 2016-02-15 19:59:48 +03:00
Konstantin Lopuhin
77ebb13684 fix assertRaises for precise env 2016-02-15 19:59:48 +03:00
Konstantin Lopuhin
08bc41cc68 py3: reviewed s3 downloader handlers 2016-02-15 19:59:48 +03:00
Konstantin Lopuhin
3cb7a567ea py3 fix for TestS3FilesStore: checksum is a native string 2016-02-15 19:59:48 +03:00
Konstantin Lopuhin
cfc567f48e botocore support for S3FilesStore 2016-02-15 19:59:48 +03:00
Konstantin Lopuhin
32cd8c9165 add direct test for S3FilesStore 2016-02-15 19:59:48 +03:00
Konstantin Lopuhin
d1470e85a2 S3FeedStorageTest: pass on py3, add some non-ascii content to be sure 2016-02-15 19:59:48 +03:00
Konstantin Lopuhin
3ada45a9bb S3FeedStorageTest: add botocore support, and organize boto/botocore checks 2016-02-15 19:59:47 +03:00
Konstantin Lopuhin
5d2f067458 S3FeedStorageTest: delete key after test 2016-02-15 19:59:47 +03:00
Konstantin Lopuhin
19b2910ad1 Fix assert_aws_environ: check for botocore with boto fallback on PY2 2016-02-15 19:59:47 +03:00
Konstantin Lopuhin
bcb92b50dc check that no extra kwargs are silently discarded 2016-02-15 19:59:47 +03:00
Konstantin Lopuhin
d6bea3bf2e botocore not only does not allow passing our own Date header, but does not handle x-amz-date according to the spec 2016-02-15 19:59:47 +03:00
Konstantin Lopuhin
7748ee6bba mock date in s3 tests when using botocore 2016-02-15 19:59:47 +03:00
Konstantin Lopuhin
c3fec83e7e use botocore by default, boto is still used in "precise" env 2016-02-15 19:59:47 +03:00
Konstantin Lopuhin
467553cc29 fix anon test: in this case we do no signing, just change the url 2016-02-15 19:59:47 +03:00
Konstantin Lopuhin
eaf3a239e4 using botocore for s3 request signing: proof of concept 2016-02-15 19:59:46 +03:00
Paul Tremberth
41588397c0 Merge pull request #1765 from scrapy/add-deprecation-for-pydispatch
[MRG+1] Add fallback and deprecation warning for pydispatch (fixes #1762)
2016-02-11 19:29:53 +01:00
Elias Dorneles
164493df2e add deprecation for pydispatch (thanks for the help @redapple) 2016-02-11 16:15:28 -02:00
Paul Tremberth
c083935806 Merge pull request #1771 from orangain/secure-cookies
[MRG+1] PY3: Implement some attributes of WrappedRequest required in Python 3
2016-02-08 11:52:00 +01:00
orangain
1f743996ff PY3: Implement some attributes of WrappedRequest required in Python 3
This will fix #1770.
2016-02-07 14:19:27 +09:00
orangain
25c56159b8 Fix SitemapSpider to extract sitemap urls from robots.txt properly
This will fix #1766.
2016-02-06 23:54:07 +09:00
Nicolas Pennequin
061c63592a MailSender.send: allow passing a charset.
Resolves Issue #348
2016-02-04 19:33:44 +01:00
Elias Dorneles
a8a6f050e7 Merge pull request #1735 from ArturGaspar/master
[MRG+1] Fix for KeyError in robots.txt middleware
2016-02-03 13:30:06 -02:00
Mikhail Korobov
43a53aca12 Merge pull request #1746 from redapple/shell-settings-logging
[MRG+1] Remove __str__ and __repr__ from settings, introduce copy_to_dict()
2016-02-03 19:59:24 +05:00
stummjr
bb2cf7c0d7 Fixed bug on XMLItemExporter with non-string fields in items 2016-01-30 10:00:06 -02:00