Mikhail Korobov
679a680638
Merge pull request #1933 from scrapy/cert-verif-ignore
...
Ignore HTTPS certificate verification failures
2016-04-20 18:57:33 +06:00
Paul Tremberth
cd979ace40
Add HTTPS tests with non-hostname-matching server certificate
2016-04-20 14:42:03 +02:00
Mikhail Korobov
3735eb8e5f
Merge pull request #1912 from redapple/https-proxy-pool-key
...
[MRG+1] Fix HTTP Pool key for HTTPS proxy tunneled connections (CONNECT method)
2016-04-20 17:32:04 +06:00
Mikhail Korobov
a8b49472b5
Merge pull request #1938 from redapple/https-proxy-connect-sni
...
Set SNI properly when using CONNECT
2016-04-20 17:30:13 +06:00
Paul Tremberth
dcea11a70c
Fall back to no-SNI context factory if Twisted<14 is used
2016-04-19 10:41:13 +02:00
Paul Tremberth
d6760dbaac
Set SNI properly when using CONNECT
2016-04-18 18:30:01 +02:00
Paul Tremberth
25ee023561
Catch VerificationError but keep the rest of ClientTLSOptions
2016-04-14 17:19:55 +02:00
Paul Tremberth
a087d2593a
Ignore HTTPS certificate verification failures
...
Fixes #1930
2016-04-14 15:59:41 +02:00
Mikhail Korobov
ba6dbad1e0
Merge pull request #1926 from redapple/faq-mcve
...
Reference StackOverflow's "minimal, complete, and verifiable example" guide
2016-04-12 18:35:34 +06:00
Paul Tremberth
2849ebf4c6
Reference StackOverflow's "minimal, complete, and verifiable example" guide
2016-04-12 14:07:33 +02:00
Paul Tremberth
47bfac1669
Merge pull request #1924 from lopuhin/faq-fix-py3
...
[MRG+1] Fix FAQ entry about python versions support (add Python 3.3+)
2016-04-12 12:15:17 +02:00
Konstantin Lopuhin
1ec49c2ada
Fix FAQ entry about python versions support
2016-04-12 11:48:57 +03:00
Paul Tremberth
86e4442d17
Fix HTTP Pool key for HTTPS proxy tunneled connections (CONNECT method)
2016-04-11 18:28:38 +02:00
Paul Tremberth
10d03ee419
Merge pull request #1916 from nblock/patch-1
...
Fix spelling mistake
2016-04-11 15:12:24 +02:00
nblock
a3557dd34d
Fix spelling mistake
2016-04-11 14:06:57 +02:00
Mikhail Korobov
ff80e1c381
Merge pull request #1913 from redapple/link-extractor-new-w3lib
...
Fix link extractor tests for non-ASCII characters from latin1 document
2016-04-09 22:05:59 +06:00
Paul Tremberth
7b5243a263
Add link extractor test for non-ASCII characters in query part of URL
2016-04-09 15:15:01 +02:00
Paul Tremberth
1656fbcffa
Fix link extractor tests for non-ASCII characters from latin1 document
...
URL path component should use UTF-8 before percent-encoding (that's what
browsers do when you open scrapy/tests/sample_data/link_extractor/linkextractor_latin1.html
and follow the links)
This matches current w3lib v1.14.1
2016-04-08 23:25:50 +02:00
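Note on 1656fbcffa: the difference between the two encodings can be seen with a minimal sketch that uses only the Python 3 standard library (not w3lib or Scrapy's link extractor). The sample path is a made-up value for illustration.

```python
from urllib.parse import quote

# Hypothetical path containing a non-ASCII character (e with acute accent).
path = "/caf\u00e9"

# Percent-encoding the UTF-8 bytes, which is what the commit message
# describes as browser behaviour for the URL path component.
utf8_encoded = quote(path, encoding="utf-8")

# Percent-encoding the latin1 bytes of the same character, for comparison.
latin1_encoded = quote(path, encoding="latin-1")

print(utf8_encoded)    # /caf%C3%A9
print(latin1_encoded)  # /caf%E9
```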
Paul Tremberth
0ede017d2a
Merge pull request #1891 from djunzu/update_files_images_pipelines
...
[MRG+1] Change Files/ImagesPipelines class attributes to instance attributes
2016-04-08 12:55:09 +02:00
Paul Tremberth
cbb695d08c
Merge pull request #1881 from nyov/dedupe
...
[MRG+1] Remove duplicate code now handled by newer w3lib
2016-04-06 15:47:06 +02:00
Paul Tremberth
642fedb3d6
Merge pull request #1902 from starrify/case-insensitive-robots-txt-for-sitemap
...
[MRG+1] Added: Making it case-insensitive when extracting sitemap URLs from a robots.txt
2016-04-04 15:23:03 +02:00
djunzu
6988e9cd4b
Update docs.
...
modified: docs/topics/media-pipeline.rst
2016-04-01 21:51:15 -03:00
Pengyu CHEN
103f6eaa88
Added: Making it case-insensitive when extracting sitemap URLs from a robots.txt
2016-04-02 02:04:50 +08:00
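Note on 103f6eaa88: a rough sketch of case-insensitive matching of the Sitemap directive. The function name iter_sitemap_urls and the sample robots.txt are assumptions made for illustration; this is not the code from the commit.

```python
def iter_sitemap_urls(robots_text):
    """Yield sitemap URLs, matching the directive name in any letter case."""
    for line in robots_text.splitlines():
        stripped = line.strip()
        # Lower-case only for the comparison so the URL itself is untouched.
        if stripped.lower().startswith("sitemap:"):
            yield stripped.split(":", 1)[1].strip()


# Hypothetical robots.txt with mixed-case directives.
sample = (
    "User-agent: *\n"
    "Sitemap: http://example.com/a.xml\n"
    "SITEMAP: http://example.com/b.xml\n"
    "sitemap: http://example.com/c.xml\n"
)
urls = list(iter_sitemap_urls(sample))
```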
Paul Tremberth
bf7f675493
Merge pull request #1847 from aron-bordin/add_blocking_storage_path_setting
...
[MRG+2] added BLOCKING_FEED_STORAGE_PATH to settings
2016-04-01 15:47:06 +02:00
Aron Bordin
9250a5bffa
added FEED_TEMPDIR to settings
2016-04-01 00:05:21 -03:00
djunzu
537083524e
Change ImagesPipeline class attributes to instance attributes.
...
modified: scrapy/pipelines/images.py
2016-03-31 19:20:43 -03:00
djunzu
8228a0c491
Change FilesPipeline class attributes to instance attributes.
...
modified: scrapy/pipelines/files.py
modified: tests/test_pipeline_files.py
2016-03-31 19:20:39 -03:00
djunzu
c7fc17866f
Move default settings to settings/default_settings.py.
...
modified: scrapy/pipelines/files.py
modified: scrapy/pipelines/images.py
modified: scrapy/settings/default_settings.py
2016-03-31 19:20:33 -03:00
djunzu
e9d48f8a8e
Add tests.
...
modified: tests/test_pipeline_files.py
modified: tests/test_pipeline_images.py
2016-03-31 19:19:49 -03:00
Paul Tremberth
9d8c368ce8
Merge pull request #1879 from scrapy/scrapy-arch-docs
...
DOC improved Architecture overview
2016-03-31 12:09:24 +02:00
Paul Tremberth
9ae4e46f32
Merge pull request #1883 from lopuhin/botocore-files-store-fix
...
[MRG+1] Make FilesPipeline work with S3FilesStore using botocore
2016-03-31 11:57:39 +02:00
Paul Tremberth
3ba5671fbc
Merge pull request #1851 from nyov/binary_or_text
...
[MRG+1] Rename isbinarytext function to binary_is_text for clarity
2016-03-31 11:55:09 +02:00
Paul Tremberth
3a763f7ba7
Merge pull request #1857 from pawelmhm/fix_response_status_msg
...
[MRG+1] response_status_message should not fail on non-standard HTTP codes
2016-03-31 11:44:44 +02:00
nyov
e8ca467572
Rename isbinarytext function to binary_is_text for clarity
...
Closes #1389
2016-03-30 15:44:15 +00:00
nyov
3787fec460
Remove duplicate code now handled by newer w3lib
...
see f3029a6a10
2016-03-30 14:58:03 +00:00
Mikhail Korobov
a38a99e0e2
Merge pull request #1893 from redapple/sphinx-1.4
...
Add support for Sphinx 1.4
2016-03-30 19:55:48 +06:00
Paul Tremberth
1075587dbd
Add support for Sphinx 1.4
...
See http://www.sphinx-doc.org/en/stable/changes.html#release-1-4-released-mar-28-2016
sphinx_rtd_theme has become optional, needs to be added to reqs
https://github.com/sphinx-doc/sphinx/pull/2320 changes node entries tuples
to 5 values instead of 4
`sh` syntax highlighting added very locally in selectors.rst
because of this warning/error with Sphinx 1.4:
```
Warning, treated as error:
/home/paul/src/scrapy/docs/topics/selectors.rst:743:
WARNING: Could not lex literal_block as "python". Highlighting skipped.
```
2016-03-30 14:40:52 +02:00
nanolab
a583e4d531
Update httpcache.py
...
It checked the cache directory's modification time, but it has to check the file's modification time.
2016-03-30 10:57:48 +02:00
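Note on a583e4d531: a small sketch of why the two modification times can disagree. The helper is_expired and the file name response_body are hypothetical and only illustrate the idea; they are not taken from httpcache.py.

```python
import os
import tempfile
import time


def is_expired(path, expiration_secs, now=None):
    """Return True if `path` was last modified more than `expiration_secs` ago."""
    now = time.time() if now is None else now
    return 0 < expiration_secs < now - os.stat(path).st_mtime


with tempfile.TemporaryDirectory() as cache_dir:
    entry = os.path.join(cache_dir, "response_body")
    with open(entry, "wb") as f:
        f.write(b"cached")

    # Backdate the file by one hour. The directory keeps its recent mtime,
    # because changing a file's timestamps does not touch the directory.
    one_hour_ago = time.time() - 3600
    os.utime(entry, (one_hour_ago, one_hour_ago))

    file_expired = is_expired(entry, expiration_secs=60)
    dir_expired = is_expired(cache_dir, expiration_secs=60)
```

Under these assumptions, checking the directory would report the entry as fresh, while checking the file reports it as expired.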
Lele
7082454f2a
Changed sel. to response. for clarity
...
Changed sel. to response. to comply with the rest of the examples in the same section, to avoid confusion.
2016-03-28 05:27:15 +05:00
Konstantin Lopuhin
fc8cd45a48
Fix a race condition in the FilesPipeline
...
Checksum calculation could happen simultaneously with
persisting the file in the store (which is done in a thread):
they operated on the same buf object.
Concretely, this led to a bug with S3FilesStore
when using botocore: the signature did not match because
the position in the buf was already at the end.
The fix is to move checksum calculation before passing buf
to the store.
2016-03-27 21:56:47 +02:00
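Note on fc8cd45a48: the effect of sharing one buffer can be shown with a single-threaded sketch. The persist function below is a stand-in for a store upload and is an assumption for illustration; the real pipeline and S3FilesStore are not used here.

```python
import hashlib
from io import BytesIO


def md5sum(buf):
    """Compute the MD5 hex digest of a file-like object, reading in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: buf.read(8192), b""):
        digest.update(chunk)
    return digest.hexdigest()


def persist(buf):
    """Hypothetical store upload: reads from the buffer's current position."""
    return buf.read()


body = b"downloaded file body"

# If the checksum has already consumed the buffer, the upload sees no data.
buf = BytesIO(body)
md5sum(buf)
uploaded_without_rewind = persist(buf)

# Ordering described in the commit message: compute the checksum first,
# then rewind and hand the buffer to the store.
buf = BytesIO(body)
checksum = md5sum(buf)
buf.seek(0)
uploaded = persist(buf)
```

Doing the checksum before the hand-off means only one consumer touches the buffer position at a time.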
Konstantin Lopuhin
5045a4f168
Fix handling of meta=None in S3FilesStore.persist_file
2016-03-25 18:35:55 +03:00
Mikhail Korobov
4f335b5a01
DOC clarify Architecture docs
2016-03-25 17:03:41 +05:00
Mikhail Korobov
3ca977a8cb
DOC improved Architecture overview
...
* spiders don't have to work on specific domains;
* explain what to use Downloader middleware for
and what to use Spider middleware for;
* Engine no longer locates spiders based on domains;
* "Spider middleware output direction" step was missing.
See also: GH-1569.
2016-03-25 07:11:33 +05:00
pawelmhm
65c7c05060
response_status_message should not fail on non-standard HTTP codes
...
The utility is used in the retry middleware, and it was failing to handle non-standard HTTP codes.
Instead of raising an exception when passing through to_native_str, it should return
an "Unknown status" message.
2016-03-12 14:16:40 +01:00
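Note on 65c7c05060: a minimal sketch of the fallback behaviour, assuming the standard library's http.client.responses table as the source of reason phrases. The function name status_message is hypothetical and the lookup table may differ from the one Scrapy uses.

```python
from http.client import responses


def status_message(status):
    """Return '<code> <reason>', falling back for codes not in the table."""
    # dict.get avoids a KeyError for non-standard codes such as 573.
    reason = responses.get(int(status), "Unknown status")
    return "%s %s" % (status, reason)


known = status_message(200)
unknown = status_message(573)
```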
Mikhail Korobov
ebef6d7c6d
Merge pull request #1848 from aron-bordin/small_doc_style_fixes
...
small doc style fixes
2016-03-07 08:49:25 +05:00
Aron Bordin
2cfe9e424d
small doc style fixes
2016-03-05 19:54:06 -03:00
Paul Tremberth
e122c569fe
Merge pull request #1842 from nyov/nyov/docs
...
[MRG+1] Update documentation links
2016-03-04 11:50:26 +01:00
nyov
5876b9aa30
Update documentation links
2016-03-03 16:28:33 +00:00
Paul Tremberth
9f4fe5dc4a
Merge pull request #1822 from nyov/nyov/scheduler
...
[MRG+1] Allow core Scheduler priority queue customization
2016-03-02 14:20:40 +01:00
Mikhail Korobov
6b2871dadd
Merge pull request #1835 from djunzu/add_pps_to_IGNORED_EXTENSIONS
...
[MRG+1] Add pps extension to IGNORED_EXTENSIONS
2016-03-02 16:15:51 +05:00