Paul Tremberth
778bed07bf
Let framework handle only HTTP redirects by default for fetch and shell commands
2016-12-07 17:56:13 +01:00
Paul Tremberth
9aefc0a886
Add test for fetch command with redirections disabled
2016-11-24 13:41:51 +01:00
Paul Tremberth
35b655d2f8
Handle redirects transparently by default in shell and fetch
Adds --no-status-aware command line option to have previous behaviour
2016-11-24 12:23:22 +01:00
Mikhail Korobov
a07400ce0a
Merge pull request #2396 from redapple/ubuntu-packages-toc
[MRG] DOC Remove "Ubuntu" section from sidebar/ToC
2016-11-22 18:21:38 +05:00
Pablo Hoffman
d62776a858
mention scrapoxy in best practices doc
2016-11-16 12:19:32 -03:00
Paul Tremberth
8db8545393
Add "orphan" metadata for Ubuntu packages page
As described in http://www.sphinx-doc.org/en/latest/markup/misc.html#metadata
2016-11-16 13:56:58 +01:00
Paul Tremberth
11fe3751cf
DOC Remove "Ubuntu" section from sidebar/ToC
2016-11-16 11:55:09 +01:00
Elias Dorneles
1cbc57faa5
Merge pull request #2337 from muriloviana/add-middleware-template
[MRG+1] Add middleware to template project (closes #2335)
2016-11-15 12:58:40 -02:00
Mikhail Korobov
570e12b5db
Merge pull request #2328 from scrapy/download-latency-meta-docs
[MRG+1] Document `download_latency` meta key
2016-11-14 21:24:14 +05:00
Valdir Stumm Junior
7025d6656a
document download_latency meta key
2016-11-14 13:06:18 -02:00
Paul Tremberth
201a16f6d5
Merge pull request #2394 from nyov/fix-le
[MRG+1] LinkExtractor PY3 'unicode' type fix
2016-11-14 12:10:14 +01:00
nyov
e8205f6733
LinkExtractor PY3 'unicode' type fix
2016-11-10 17:50:06 +00:00
Paul Tremberth
de89b1b562
Merge pull request #2275 from scrapy/response-css-xpath-message
[MRG+1] Add better messages for when response content isn't text (closes #2264)
2016-11-10 11:38:22 +01:00
Mikhail Korobov
6d6dc7138b
Merge pull request #2387 from rolando-contribute/conda-forge-updates
[MRG+1] Update conda channel to conda-forge and show conda version badge.
2016-11-10 00:35:43 +06:00
Rolando Espinoza
76459e1969
Update conda channel to conda-forge and show conda version badge.
2016-11-08 21:30:27 -03:00
Mikhail Korobov
ea83e67796
Merge pull request #2382 from redapple/slot-heartbeat-state-test
Test Slot's heartbeat state before stopping it
2016-11-08 18:32:07 +06:00
Paul Tremberth
af2280e695
Update docstring
2016-11-08 13:30:51 +01:00
Paul Tremberth
27456996a9
Add assertion on crawler not running
2016-11-08 11:46:16 +01:00
Paul Tremberth
61efacdd1f
Add testcase for catching exception from open_spider() from pipeline
2016-11-08 11:35:42 +01:00
Paul Tremberth
7727d87f64
Test Slot's heartbeat state before stopping it
Also add a test on state of looping task in LogStats extension
Fixes #2011 and #2362
2016-11-07 16:44:57 +01:00
Mikhail Korobov
0f5eb4cf88
Merge pull request #2380 from rahulkant13may/master
Document update in "Using Firefox for scraping"
2016-11-07 19:02:53 +06:00
Rahul Kant
f56aef99c2
Add closing tag in <tbody>
2016-11-07 17:49:22 +05:30
Mikhail Korobov
9755ef933a
Merge pull request #2369 from jc0n/media-pipeline-docs-typo
Fix typo in media pipeline docs
2016-10-29 12:46:48 +06:00
John O'Connor
d85da273be
Fix typo in media pipeline docs
2016-10-28 19:44:46 -07:00
Elias Dorneles
a9c74dbe42
Merge pull request #2339 from visued/master
[MRG+1] Update documentation to explain the use of double quotes on Windows (fixes #2325)
2016-10-25 09:53:24 -02:00
Paul Tremberth
bd4f156d16
Merge pull request #2354 from stav/doc-spider-arguments
[MRG+1] doc: wording
2016-10-24 12:42:59 +02:00
Steven Almeroth
99daea495b
Doc: wording
2016-10-21 16:16:42 -07:00
Steven Almeroth
a958e54954
Doc: remove trailing spaces
2016-10-21 16:16:37 -07:00
Mikhail Korobov
1be2447af9
Merge pull request #2346 from redapple/master-pr2313
Fix tutorial AuthorSpider
2016-10-21 11:08:17 +02:00
Randy Pen
c7d245b90b
update
Thx for your advice.
2016-10-21 10:52:50 +02:00
Randy Pen
eacc5937e4
fix example code
In the AuthorSpider, original css selector failed to get links of author pages
2016-10-21 10:52:50 +02:00
Paul Tremberth
6df48d57e0
Bump version: 1.2.0 → 1.2.1
1.2.1
2016-10-21 10:37:45 +02:00
Paul Tremberth
f0e935735f
Merge pull request #2341 from redapple/release-notes-1.2.1
Update release notes for upcoming 1.2.1
2016-10-21 10:34:47 +02:00
Paul Tremberth
c559a64c20
Set date for 1.2.1 release
2016-10-21 10:17:46 +02:00
Paul Tremberth
22772a61c8
Update release notes for upcoming 1.2.1
2016-10-21 10:17:04 +02:00
Paul Tremberth
eb5d396527
Merge pull request #2344 from jaympatel/master
Typo (through was misspelled)
2016-10-20 22:24:14 +02:00
Jay Patel
f357ccd0d7
Typo (through was misspelled)
2016-10-20 15:52:28 -04:00
Mikhail Korobov
871eec9827
Merge pull request #2327 from bopace/http-header-docs
[MRG+1] Added documentation about accessing header values
2016-10-20 09:05:14 +02:00
Mikhail Korobov
71c8278f57
Merge pull request #2329 from josericardo/scrapy-2262
[MRG+1] Better explain middleware orders and processing directions
2016-10-20 09:04:19 +02:00
Paul Tremberth
db40852892
Do not interpret non-ASCII bytes in "Location" and percent-encode them (#2322)
* Do not interpret non-ASCII bytes in "Location" and percent-encode them
Fixes GH-2321
The idea is to not guess the encoding of "Location" header value
and simply percent-encode non-ASCII bytes,
which should then be re-interpreted correctly by the remote website
in whatever encoding was used originally.
See https://tools.ietf.org/html/rfc3987#section-3.2
This is similar to the changes to safe_url_string in
https://github.com/scrapy/w3lib/pull/45
* Remove unused import
2016-10-19 23:26:12 -03:00
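A minimal sketch of the idea described in db40852892, assuming Python 3 and only the standard library: the raw "Location" bytes are never decoded with a guessed encoding, and each non-ASCII byte is percent-encoded as-is. The helper name `percent_encode_location` and the set of characters treated as safe are illustrative assumptions, not the code from the commit or from w3lib's safe_url_string.

```python
from urllib.parse import quote

# Assumption: ASCII URL delimiters and "%" are passed through unchanged,
# so existing percent-escapes in the header are not double-encoded.
_SAFE_ASCII = "!#$%&'()*+,/:;=?@[]~"


def percent_encode_location(raw: bytes) -> str:
    """Percent-encode a raw Location header value without decoding it.

    Hypothetical helper: bytes outside the safe ASCII set (including
    every byte >= 0x80) become %XX escapes, whatever encoding the
    remote site used to produce them.
    """
    return quote(raw, safe=_SAFE_ASCII)


if __name__ == "__main__":
    # The same path sent as Latin-1 and as UTF-8 bytes: each keeps its
    # original byte sequence, only in escaped form.
    print(percent_encode_location(b"http://example.com/caf\xe9"))
    print(percent_encode_location(b"http://example.com/caf\xc3\xa9"))
```

Because the bytes are escaped rather than re-encoded, the remote website receives the same byte sequence it originally sent and can interpret it in its own encoding.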
muriloviana
09c401bf8e
use crawler only to register signals in from_crawler()
2016-10-19 14:09:01 -02:00
muriloviana
32fd692810
define method spider_opened and update class instructions
2016-10-19 13:50:09 -02:00
muriloviana
38e292a132
add from_crawler method to template and turn the returns methods explicit
2016-10-19 12:47:23 -02:00
muriloviana
34f2014c55
change settings middleware name and updating middleware template
2016-10-19 00:37:31 -02:00
muriloviana
d5bd44a5b9
add middleware to template project
2016-10-18 17:29:16 -02:00
Victor Sued
f74051e69e
update documentation to explain the use of double quotes on Windows.
2016-10-18 16:36:43 -02:00
bopace
fd016ee71b
Fixed wording of documentation
2016-10-18 09:37:45 -06:00
Paul Tremberth
6cc83c041e
Merge pull request #2330 from lfmattossch/note-python2-name
[MRG+1] Added a note about invalid spider names in python 2 (fixes #2251)
2016-10-18 16:35:19 +02:00
Jose Ricardo
e12e364a40
Add details to the spider middlewares docs
Document the effects of the middleware order in a more detailed way.
2016-10-18 12:29:30 -02:00
Mikhail Korobov
a5f4450313
Merge pull request #2302 from Granitosaurus/pipeline_doc_fix
[MRG+1] Fix JsonWriterPipeline example in docs
2016-10-18 16:20:59 +02:00