mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 20:44:18 +00:00

4596 Commits

Author SHA1 Message Date
Nikolaos-Digenis Karagiannis
2227805619 Compatibility with .15 leftover 2014-10-22 21:48:08 +03:00
Pablo Hoffman
0dce283459 Merge pull request #893 from kmike/less-ads
[MRG] DOC simplify extension docs
2014-10-21 17:13:59 -02:00
Daniel Graña
44cbbecb44 Merge pull request #914 from brunsgaard/master
Deleted bin folder from root, fixes #913
2014-10-09 14:58:36 -02:00
Pablo Hoffman
aa61f615d8 Merge pull request #795 from chekunkov/spider_error_processing_referer
Add referer to "Spider error processing" log message
2014-10-07 20:57:48 -02:00
Jonas Brunsgaard
db2474f7e7 Deleted bin folder from root, fixes #913 2014-10-07 13:54:04 +02:00
Mikhail Korobov
7d68b084a4 DOC document download_timeout Request.meta key and download_timeout spider attribute. 2014-10-07 04:23:11 +06:00
Pablo Hoffman
9af61d5df6 Merge pull request #895 from scrapy/we-are-past-0.15
[MRG] drop support for CONCURRENT_REQUESTS_PER_SPIDER
2014-10-03 17:14:00 -03:00
Pablo Hoffman
5f1bbe2dd3 Merge pull request #911 from nyov/nyov/dev
Drop old engine code
2014-10-03 17:13:17 -03:00
nyov
7db6bbce27 Drop old engine code
* remove Downloader import unused since 1fba64
* remove CONCURRENT_SPIDERS deprecation warning from a1dbc6 (2011)
2014-10-03 19:07:51 +00:00
Mikhail Korobov
14957ed71e Merge pull request #909 from VKen/master
updated deprecated cgi.parse_qsl to use six's parse_qsl
2014-10-03 16:30:39 +06:00
VKen
33a7c1d438 updated deprecated cgi.parse_qsl to use six's parse_qsl 2014-10-03 04:16:21 +08:00
Mikhail Korobov
ea3b372b4f DOC typo fix in leaks.rst 2014-10-02 15:20:13 +06:00
Pablo Hoffman
993b543e1b mark SEP-019 as Final 2014-10-02 01:17:31 -03:00
Pablo Hoffman
e7843d35de Merge pull request #894 from kmike/leaks-docs
Leaks docs
2014-10-02 01:14:54 -03:00
Pablo Hoffman
5835224eee Merge pull request #896 from scrapy/robotstxt-once
[MRG] process robots.txt once
2014-10-02 00:58:55 -03:00
Pablo Hoffman
9e2c60430d Merge pull request #902 from scrapy/load_object_full_traceback
[MRG] scrapy.utils.misc.load_object should print full traceback
2014-10-02 00:33:27 -03:00
Pablo Hoffman
0c23b1342b Merge pull request #904 from scrapy/from_crawler_docs
[MRG] DOC document from_crawler method for item pipelines
2014-10-02 00:09:07 -03:00
Mikhail Korobov
6fcf9dce50 DOC document from_crawler method for item pipelines; add an example. 2014-09-25 03:13:51 +06:00
Mikhail Korobov
5086262913 don't hide original exception in scrapy.utils.misc.load_object 2014-09-24 13:27:14 +06:00
Mikhail Korobov
36eec8f413 dont_obey_robotstxt meta key; don't process requests to /robots.txt 2014-09-23 00:10:43 +06:00
Mikhail Korobov
fe6f3efe95 RobotsTxtMiddleware: remove unused attribute 2014-09-22 22:56:54 +06:00
Mikhail Korobov
d11c8595e6 drop support for CONCURRENT_REQUESTS_PER_SPIDER 2014-09-22 04:29:22 +06:00
Mikhail Korobov
bdbca1e2d7 DOC request queue memory usage 2014-09-21 07:30:44 +06:00
Mikhail Korobov
bc0f481a73 DOC bring back notes about multiple spiders per process because it is now documented how to do that 2014-09-21 07:12:01 +06:00
Mikhail Korobov
a122fdbfea Update leaks.rst: there is now only a single spider in a process. 2014-09-21 06:54:00 +06:00
Mikhail Korobov
7be3479c20 CookieJar cleanup 2014-09-21 06:37:32 +06:00
Mikhail Korobov
49645d4bf9 TST small cleanup of a cookie test 2014-09-21 05:31:34 +06:00
Mikhail Korobov
c543fe6e4c Merge pull request #878 from andrewshir/master
Fix bug for ".local" host name
2014-09-21 05:24:54 +06:00
Mikhail Korobov
e435b3e3a3 DOC simplify extension docs 2014-09-21 00:19:24 +06:00
John-Scott Atlakson
a312ebfb43 Update request-response.rst
Fixed minor typo
2014-09-14 22:06:31 +06:00
andrewshir
e583c030db Test for local domains (without dots) added 2014-09-14 14:24:16 +06:00
Mikael Åhlén
22da1783bd added a test-case for wrong quotechar 2014-09-13 03:47:40 +02:00
Mikael Åhlén
47b6dff9f1 Allow to specify the quotechar in CSVFeedSpider 2014-09-13 02:14:57 +02:00
Daniel Graña
5bcabfe9c9 SPIDER_MODULES can be set as a csv string 2014-09-10 23:25:57 -03:00
Daniel Graña
c05e99a4f4 oops, restore Pillow from precise test requirements 2014-09-10 12:21:08 -03:00
Daniel Graña
a823207f18 Stop logobserver only when set 2014-09-10 12:09:07 -03:00
Daniel Graña
ec93c0fdcc Add the tests changes for previous commit 2014-09-10 12:05:18 -03:00
Daniel Graña
ce180227fa Twisted 11.1.0 (precise) can not deal with generators in DeferredList
Also create a list of the crawlers before iterating them because crawlers are removed from the set once stopped
2014-09-10 12:04:14 -03:00
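The second point in commit ce180227fa (copy the crawlers into a list before iterating, because stopping a crawler removes it from the set) can be sketched in plain Python. This is an illustration only, not the Scrapy code: `stop_all` and the string "crawlers" are hypothetical stand-ins, and no Twisted objects are involved.

```python
def stop_all(crawlers):
    """Stop every crawler in the set and return them in the order stopped.

    `crawlers` is a hypothetical stand-in for the managed set of crawlers.
    """
    stopped = []
    # list(crawlers) takes a snapshot. Iterating the set directly while
    # discarding from it raises "RuntimeError: Set changed size during
    # iteration" in CPython.
    for crawler in list(crawlers):
        crawlers.discard(crawler)  # stands in for removal on stop
        stopped.append(crawler)
    return stopped


managed = {"crawler-a", "crawler-b", "crawler-c"}
stopped = stop_all(managed)
```

Building a concrete list also matches the first point of the commit message, which says that a generator cannot be handed to `DeferredList` on Twisted 11.1.0.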
Daniel Graña
99971dc8a8 Do not pop the crawler from the managed list 2014-09-09 20:59:07 +00:00
Daniel Graña
774aa9ee56 Merge branch 'crawler-api' 2014-09-09 17:23:11 -03:00
Daniel Graña
8ddf0811a8 Correctly detect when all managed crawlers are done in CrawlerRunner 2014-09-09 17:21:39 -03:00
Daniel Graña
68954fa503 Merge pull request #879 from Curita/fix-874-issue
Fix #874 issue
2014-09-08 14:32:06 -03:00
Julia Medina
51532af69a Erase unneeded flag in CrawlerProcess.start 2014-09-07 13:03:34 -03:00
Julia Medina
d513b5a542 Run root logger in CrawlerProcess creation instead of in its start method 2014-09-07 13:02:39 -03:00
andrewshir
dfca7b3c80 Fix bug for ".local" host name
It's necessary to put new list member in squared brackets (i.e. create new list) to merge lists properly, otherwise we will get result list with character elements instead of string element.
2014-09-06 18:23:27 +06:00
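The behaviour described in commit dfca7b3c80 can be reproduced with a short sketch. The function names and host values below are hypothetical and are not taken from the Scrapy source; the sketch only assumes that the original code extended a list with a bare string.

```python
def merge_buggy(hosts, new_host):
    """Extend a copy of `hosts` with a bare string (the mistake)."""
    merged = list(hosts)
    # A string is iterable, so += adds one element per character.
    merged += new_host
    return merged


def merge_fixed(hosts, new_host):
    """Extend a copy of `hosts` with a one-element list (the fix)."""
    merged = list(hosts)
    # Wrapping the string in brackets adds it as a single element.
    merged += [new_host]
    return merged


buggy = merge_buggy(["example.com"], "local")
fixed = merge_fixed(["example.com"], "local")
```

Here `buggy` ends up with six elements (the original host plus the five characters of "local"), while `fixed` has the two host names expected.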
Pablo Hoffman
acb0a61cf1 Merge pull request #871 from eltermann/master
Removed unused 'load=False' parameter from walk_modules()
2014-09-02 12:55:04 -03:00
eltermann
1dff1fbf75 Removed unused 'load=False' parameter from walk_modules() 2014-09-02 08:33:36 -03:00
Daniel Graña
5daa14770b Merge branch 'Curita-per-spider-settings' 2014-09-01 21:58:13 -03:00
Julia Medina
c2592b39fd Test verifying that CrawlerRunner populates spider class settings 2014-09-01 21:56:57 -03:00
Julia Medina
77bd26a66d Non mutable default in Spider.custom_settings 2014-09-01 21:56:57 -03:00