mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 06:23:58 +00:00

6010 Commits

Author SHA1 Message Date
Elias Dorneles
a9c74dbe42 Merge pull request #2339 from visued/master
[MRG+1] Update documentation to explain the use of double quotes on Windows (fixes #2325)
2016-10-25 09:53:24 -02:00
Paul Tremberth
bd4f156d16 Merge pull request #2354 from stav/doc-spider-arguments
[MRG+1] doc: wording
2016-10-24 12:42:59 +02:00
Steven Almeroth
99daea495b Doc: wording 2016-10-21 16:16:42 -07:00
Steven Almeroth
a958e54954 Doc: remove trailing spaces 2016-10-21 16:16:37 -07:00
Mikhail Korobov
1be2447af9 Merge pull request #2346 from redapple/master-pr2313
Fix tutorial AuthorSpider
2016-10-21 11:08:17 +02:00
Randy Pen
c7d245b90b update
Thx for your advice.
2016-10-21 10:52:50 +02:00
Randy Pen
eacc5937e4 fix example code
In the AuthorSpider, original css selector failed to get links of author pages
2016-10-21 10:52:50 +02:00
Paul Tremberth
6df48d57e0 Bump version: 1.2.0 → 1.2.1 1.2.1 2016-10-21 10:37:45 +02:00
Paul Tremberth
f0e935735f Merge pull request #2341 from redapple/release-notes-1.2.1
Update release notes for upcoming 1.2.1
2016-10-21 10:34:47 +02:00
Paul Tremberth
c559a64c20 Set date for 1.2.1 release 2016-10-21 10:17:46 +02:00
Paul Tremberth
22772a61c8 Update release notes for upcoming 1.2.1 2016-10-21 10:17:04 +02:00
Paul Tremberth
eb5d396527 Merge pull request #2344 from jaympatel/master
Typo (through was misspelled)
2016-10-20 22:24:14 +02:00
Jay Patel
f357ccd0d7 Typo (through was misspelled) 2016-10-20 15:52:28 -04:00
Mikhail Korobov
871eec9827 Merge pull request #2327 from bopace/http-header-docs
[MRG+1] Added documentation about accessing header values
2016-10-20 09:05:14 +02:00
Mikhail Korobov
71c8278f57 Merge pull request #2329 from josericardo/scrapy-2262
[MRG+1] Better explain middleware orders and processing directions
2016-10-20 09:04:19 +02:00
Paul Tremberth
db40852892 Do not interpret non-ASCII bytes in "Location" and percent-encode them (#2322)
* Do not interpret non-ASCII bytes in "Location" and percent-encode them

Fixes GH-2321

The idea is to not guess the encoding of "Location" header value
and simply percent-encode non-ASCII bytes,
which should then be re-interpreted correctly by the remote website
in whatever encoding was used originally.

See https://tools.ietf.org/html/rfc3987#section-3.2

This is similar to the changes to safe_url_string in
https://github.com/scrapy/w3lib/pull/45

* Remove unused import
2016-10-19 23:26:12 -03:00
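
The idea described in the commit above (leave ASCII bytes alone, percent-encode everything else, never guess an encoding) can be sketched in a few lines. This is only an illustration, not the code from the commit: the helper name `percent_encode_non_ascii` and the sample header values are assumptions made for this example.

```python
def percent_encode_non_ascii(raw: bytes) -> str:
    """Percent-encode every non-ASCII byte; leave ASCII bytes untouched.

    Hypothetical helper for illustration only. No decoding step is involved,
    so the original byte sequence survives whatever encoding the server used.
    """
    return "".join(chr(b) if b < 0x80 else "%{:02X}".format(b) for b in raw)


# The same path sent by two servers that use different encodings (made-up values).
utf8_location = "/café".encode("utf-8")
latin1_location = "/café".encode("latin-1")

print(percent_encode_non_ascii(utf8_location))    # /caf%C3%A9
print(percent_encode_non_ascii(latin1_location))  # /caf%E9
```

Because each byte maps to exactly one `%XX` escape, the remote website receives the bytes it originally sent and can interpret them in its own encoding.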
muriloviana
09c401bf8e use crawler only to register signals in from_crawler() 2016-10-19 14:09:01 -02:00
muriloviana
32fd692810 define method spider_opened and update class instructions 2016-10-19 13:50:09 -02:00
muriloviana
38e292a132 add from_crawler method to template and turn the returns methods explicit 2016-10-19 12:47:23 -02:00
muriloviana
34f2014c55 change settings middleware name and updating middleware template 2016-10-19 00:37:31 -02:00
muriloviana
d5bd44a5b9 add middleware to template project 2016-10-18 17:29:16 -02:00
Victor Sued
f74051e69e update documentation to explain the use of double quotes on Windows. 2016-10-18 16:36:43 -02:00
bopace
fd016ee71b Fixed wording of documentation 2016-10-18 09:37:45 -06:00
Moisés Guimarães
3fb6e52457 fixes import for py35 env. 2016-10-18 12:24:11 -03:00
Paul Tremberth
6cc83c041e Merge pull request #2330 from lfmattossch/note-python2-name
[MRG+1] Added a note about invalid spider names in python 2 (fixes #2251)
2016-10-18 16:35:19 +02:00
Jose Ricardo
e12e364a40 Add details to the spider middlewares docs
Document the effects of the middleware order in a more detailed way.
2016-10-18 12:29:30 -02:00
Mikhail Korobov
a5f4450313 Merge pull request #2302 from Granitosaurus/pipeline_doc_fix
[MRG+1] Fix JsonWriterPipeline example in docs
2016-10-18 16:20:59 +02:00
Moisés Guimarães
45e95b79ce (fixes #2272) using arg_to_iter() to wrap single values and list() to avoid consuming from generators. 2016-10-18 11:06:55 -03:00
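
The commit message above names two steps: wrapping single values so they can be iterated, and calling `list()` so a generator is not left half-consumed. A rough sketch of that pattern follows. The helper `wrap_single` is a hypothetical stand-in written for this example and is not claimed to match Scrapy's `arg_to_iter()`; the commit's actual code is not shown in this log.

```python
def wrap_single(value):
    """Hypothetical stand-in: return an iterable for any input.

    None becomes an empty list; strings, bytes, dicts and non-iterables
    are wrapped in a one-element list; other iterables pass through.
    """
    if value is None:
        return []
    if isinstance(value, (str, bytes, dict)) or not hasattr(value, "__iter__"):
        return [value]
    return value


def snapshot(value):
    """Materialize the values once so they can be read more than once."""
    return list(wrap_single(value))


values = snapshot(x * 2 for x in range(3))
print(values)           # [0, 2, 4]
print(values)           # still [0, 2, 4]; a bare generator would now be exhausted
print(snapshot("abc"))  # ['abc'], not ['a', 'b', 'c']
```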
Luiz Fernando Mattos Schlindwein
7c33e0cb55 added a note about invalid spider names in python 2 2016-10-18 11:55:51 -02:00
Jose Ricardo
ea7bd39529 Make architecture overview references a little more clear on the docs
Make explicit what actually happens by adding links to the respective methods
that are invoked in each processing phase.
2016-10-18 11:50:51 -02:00
Jose Ricardo
bebcd5081c Add downloader middleware ordering details to the docs
Add more details, making it easier to understand the effects of
setting a downloader middleware order.
2016-10-18 11:22:55 -02:00
Bo Pace
bfe28ae707 Added documentation about accessing header values 2016-10-17 14:10:05 -06:00
Mikhail Korobov
7e20725eb7 Merge pull request #2314 from redapple/tls-ciphers
[MRG+1] Use OpenSSL default ciphers
2016-10-17 11:44:28 +02:00
Paul Tremberth
dc1f9ad211 Merge pull request #2307 from eLRuLL/genspider-no-www-fix
genspider: removing www. from starturl templates
2016-10-12 15:32:18 +02:00
Raul Gallegos
118b42ab59 making start_urls a list in basic genspider template 2016-10-11 22:08:05 -05:00
Paul Tremberth
c3411373e8 Use OpenSSL default ciphers
Twisted's default TLS options restrict the ciphers list a bit -- "a secure default"
e38cc25a67/src/twisted/internet/_sslverify.py (L1861)

We want to be a bit more permissive with Scrapy
(at least as permissive as Scrapy 1.0 was, and which used a default OpenSSL Context)

See https://www.openssl.org/docs/manmaster/apps/ciphers.html#CIPHER_STRINGS

OpenSSL's 'DEFAULT' seems to be reasonable enough.

Fixes #2311
2016-10-07 12:28:16 +02:00
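
As a rough illustration of what selecting OpenSSL's `DEFAULT` cipher string means, the sketch below uses Python's standard-library `ssl` module. This is a swapped-in stand-in chosen so the example is self-contained: the commit itself concerns Scrapy's Twisted-based TLS setup, and the variable names here are assumptions.

```python
import ssl

# Build a client-side TLS context and ask OpenSSL for its 'DEFAULT' cipher list
# instead of a hand-restricted one.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
context.set_ciphers("DEFAULT")

# Inspect which ciphers the linked OpenSSL build actually enabled.
cipher_names = [cipher["name"] for cipher in context.get_ciphers()]
print(len(cipher_names), "ciphers enabled")
```

The exact list depends on the OpenSSL version Python is linked against, so the output varies between systems.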
Raul Gallegos
dd778892d0 genspider: removing www. from starturl templates 2016-10-05 12:16:45 -05:00
Bernardas
dfba151f59 Remove unnecessary note for the JsonWriterPipeline example 2016-10-05 16:36:02 +00:00
Bernardas
eb91cb8ea2 fix JsonWriterPipeline example 2016-10-03 20:31:41 +00:00
Paul Tremberth
3235bfeb1e Bump version: 1.2.0dev2 → 1.2.0 1.2.0 2016-10-03 13:04:11 +02:00
Paul Tremberth
95c6b9dffd Merge pull request #2280 from scrapy/release-notes-1.2
Update release notes for upcoming 1.2.0
2016-10-03 13:00:14 +02:00
Paul Tremberth
e2137d77ce Add release date for scrapy 1.2 2016-10-03 12:41:18 +02:00
Paul Tremberth
a433126301 Merge pull request #2294 from redapple/release-headers
Change release section titles to have correct links in ToC
2016-10-03 12:38:44 +02:00
Paul Tremberth
32c46453c3 Merge pull request #2300 from mineo/patch-3
Remove duplicate colons from the feed export settings docs
2016-10-03 12:34:28 +02:00
Wieland Hoffmann
e8edc6c2bb Remove duplicate colons from the feed export settings docs 2016-10-02 16:09:29 +02:00
Daniel Graña
5d38a8e2d9 Merge pull request #2298 from scrapy/update-1.2-relnotes-data-dir-util
Update 1.2 release notes with data_path changes from #1581
2016-09-30 15:23:55 -03:00
Daniel Graña
bca374d651 Merge pull request #1581 from scrapy/fix-util-function-to-work-outside-project-dir
[MRG+1] Make data_path work when outside project (used by HttpCacheMiddleware and Deltafetch plugin)
2016-09-30 15:23:34 -03:00
Elias Dorneles
2d932c173c test abs path outside project as well 2016-09-30 15:07:58 -03:00
Paul Tremberth
fed53c1e28 Merge pull request #2267 from scrapy/deprecate-ubuntu-packages
[MRG+1] Deprecate official Ubuntu packages and update installation instructions
2016-09-30 17:35:29 +02:00
Elias Dorneles
559b4edaec update release notes with changes from #1581 2016-09-30 11:42:50 -03:00