mirror of https://github.com/scrapy/scrapy.git synced 2025-03-01 14:47:30 +00:00

4676 Commits

Author SHA1 Message Date
Pablo Hoffman
78954bc7f4 Merge pull request #978 from immerrr/fix-s3-authorization-for-quoted-paths
S3DownloadHandler: fix auth for requests with quoted paths/query params
2014-12-15 17:15:11 -02:00
tpeng
82d138e87e support namespace prefix in xmliter_lxml 2014-12-15 16:17:06 +01:00
Artur Gaspar
22247cf791 move restrict_css argument to end of argument list in link extractors for backwards compatibility, use keyword arguments in link extractor super().__init__() calls 2014-12-15 09:18:15 -02:00
Artur Gaspar
b0730a1d16 documentation for CSS support in link extractors 2014-12-11 18:22:08 -02:00
Artur Gaspar
403fc686b8 tests for CSS support in link extractors 2014-12-11 18:20:30 -02:00
Artur Gaspar
d4cb03eded add CSS support for link extractors 2014-12-11 16:45:46 -02:00
immerrr
82b187f283 S3DownloadHandler: fix auth for requests with quoted paths/query params 2014-12-11 18:14:36 +03:00
Mikhail Korobov
c485a05540 Merge pull request #976 from aufziehvogel/fixed_mailsender_vartypes
fixed the variable types in mailsender documentation
2014-12-11 03:02:33 +05:00
Stefan
3602fc4fcb fixed the variable types in mailsender documentation 2014-12-10 22:48:09 +01:00
Pablo Hoffman
1b03b12996 Merge pull request #961 from ldmberman/task_977_request_dropped
An attempt to resolve #957, add signal to be sent when request is dropped by the scheduler
2014-12-02 19:46:43 -02:00
Lev Berman
fdb6bb07c0 #977 - test dropping requests 2014-11-28 10:53:33 +03:00
Lev Berman
e04b0aff74 An attempt to resolve #977, add signal to be sent when request is dropped by the scheduler 2014-11-27 15:10:15 +03:00
Pablo Hoffman
c31fb87335 Merge pull request #954 from kalessin/int-download-timeout
Force to read DOWNLOAD_TIMEOUT as int (for example to pass using environment variable)
2014-11-26 17:14:18 -02:00
Pablo Hoffman
dedea72774 Merge pull request #946 from tpeng/limit-response-size
avoid download large response
2014-11-25 17:57:55 -02:00
tpeng
cd19382754 attempt to fix travis fails 2014-11-25 14:20:25 +01:00
Daniel Graña
8d8e1b2c0c mitmproxy 0.10.1 needs netlib 0.10.1 too 2014-11-21 12:15:02 -02:00
Daniel Graña
314db3db8b pin mitmproxy 0.10.1 as >0.11 does not work with tests 2014-11-21 10:54:43 -02:00
Martin Olveyra
7910fa0172 Force to read DOWNLOAD_TIMEOUT as int (for example to pass using environment variable)
2014-11-21 01:09:32 -02:00
tpeng
a69f042d10 add 2 more test cases and minor doc fixes 2014-11-19 15:31:07 +01:00
tpeng
fa84730e70 avoid download large response
introduce DOWNLOAD_MAXSIZE and DOWNLOAD_WARNSIZE in settings and
download_maxsize/download_warnsize in spider/request meta, so
downloader stop downloading as soon as the received data exceed the
limit. also check the Twisted response's length in advance to stop
downloading as early as possible.
2014-11-12 12:28:02 +01:00
Pablo Hoffman
ed84231b60 Merge pull request #944 from JeffPaine/patch-1
Update docs copyright year range
2014-11-10 14:03:00 -02:00
Jeff Paine
b422312a38 Update docs copyright year range 2014-11-09 21:08:27 -05:00
Lazar-T
13f83f0da0 typo 2014-11-10 06:28:41 +05:00
HalfCrazy
b21a28cc9a Afterwords->Afterwards 2014-11-10 06:28:09 +05:00
Daniel Graña
2c67bd6c57 pywin32 is required by Twisted. closes #937
see:
* http://twistedmatrix.com/trac/ticket/6032
* https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2028
2014-11-05 23:05:54 -02:00
Daniel Graña
6cb8995731 Update install.rst
fixes #937
2014-11-05 22:50:52 -02:00
Pablo Hoffman
efe589c643 Merge pull request #882 from ahlen/feature/csvfeed-quotechar
[MRG+1] Allow to specify the quotechar in CSVFeedSpider
2014-11-04 11:32:59 -02:00
Lazar-T
38dcf50cd6 comma instead of fullstop 2014-10-25 09:19:50 +06:00
Pablo Hoffman
675fd5ba04 Merge pull request #898 from scrapy/download-timeout
[MRG] DOC document download_timeout
2014-10-24 16:52:42 -02:00
Pablo Hoffman
5de6a11fda Merge pull request #925 from Digenis/master
a leftover for .15 compatibility
2014-10-22 18:23:19 -02:00
Nikolaos-Digenis Karagiannis
2227805619 Compatibility with .15 leftover 2014-10-22 21:48:08 +03:00
Pablo Hoffman
0dce283459 Merge pull request #893 from kmike/less-ads
[MRG] DOC simplify extension docs
2014-10-21 17:13:59 -02:00
Daniel Graña
44cbbecb44 Merge pull request #914 from brunsgaard/master
Deleted bin folder from root, fixes #913
2014-10-09 14:58:36 -02:00
Pablo Hoffman
aa61f615d8 Merge pull request #795 from chekunkov/spider_error_processing_referer
Add referer to "Spider error processing" log message
2014-10-07 20:57:48 -02:00
Jonas Brunsgaard
db2474f7e7 Deleted bin folder from root, fixes #913 2014-10-07 13:54:04 +02:00
Mikhail Korobov
7d68b084a4 DOC document download_timeout Request.meta key and download_timeout spider attribute. 2014-10-07 04:23:11 +06:00
Pablo Hoffman
9af61d5df6 Merge pull request #895 from scrapy/we-are-past-0.15
[MRG] drop support for CONCURRENT_REQUESTS_PER_SPIDER
2014-10-03 17:14:00 -03:00
Pablo Hoffman
5f1bbe2dd3 Merge pull request #911 from nyov/nyov/dev
Drop old engine code
2014-10-03 17:13:17 -03:00
nyov
7db6bbce27 Drop old engine code
* remove Downloader import unused since 1fba64
* remove CONCURRENT_SPIDERS deprecation warning from a1dbc6 (2011)
2014-10-03 19:07:51 +00:00
Mikhail Korobov
14957ed71e Merge pull request #909 from VKen/master
updated deprecated cgi.parse_qsl to use six's parse_qsl
2014-10-03 16:30:39 +06:00
VKen
33a7c1d438 updated deprecated cgi.parse_qsl to use six's parse_qsl 2014-10-03 04:16:21 +08:00
Mikhail Korobov
ea3b372b4f DOC typo fix in leaks.rst 2014-10-02 15:20:13 +06:00
Pablo Hoffman
993b543e1b mark SEP-019 as Final 2014-10-02 01:17:31 -03:00
Pablo Hoffman
e7843d35de Merge pull request #894 from kmike/leaks-docs
Leaks docs
2014-10-02 01:14:54 -03:00
Pablo Hoffman
5835224eee Merge pull request #896 from scrapy/robotstxt-once
[MRG] process robots.txt once
2014-10-02 00:58:55 -03:00
Pablo Hoffman
9e2c60430d Merge pull request #902 from scrapy/load_object_full_traceback
[MRG] scrapy.utils.misc.load_object should print full traceback
2014-10-02 00:33:27 -03:00
Pablo Hoffman
0c23b1342b Merge pull request #904 from scrapy/from_crawler_docs
[MRG] DOC document from_crawler method for item pipelines
2014-10-02 00:09:07 -03:00
Mikhail Korobov
6fcf9dce50 DOC document from_crawler method for item pipelines; add an example. 2014-09-25 03:13:51 +06:00
Mikhail Korobov
5086262913 don't hide original exception in scrapy.utils.misc.load_object 2014-09-24 13:27:14 +06:00
Mikhail Korobov
36eec8f413 dont_obey_robotstxt meta key; don't process requests to /robots.txt 2014-09-23 00:10:43 +06:00