mirror of https://github.com/scrapy/scrapy.git synced 2025-03-01 10:53:06 +00:00

4686 Commits

Author SHA1 Message Date
VKen
33a7c1d438 updated deprecated cgi.parse_qsl to use six's parse_qsl 2014-10-03 04:16:21 +08:00
Mikhail Korobov
ea3b372b4f DOC typo fix in leaks.rst 2014-10-02 15:20:13 +06:00
Pablo Hoffman
993b543e1b mark SEP-019 as Final 2014-10-02 01:17:31 -03:00
Pablo Hoffman
e7843d35de Merge pull request #894 from kmike/leaks-docs
Leaks docs
2014-10-02 01:14:54 -03:00
Pablo Hoffman
5835224eee Merge pull request #896 from scrapy/robotstxt-once
[MRG] process robots.txt once
2014-10-02 00:58:55 -03:00
Pablo Hoffman
9e2c60430d Merge pull request #902 from scrapy/load_object_full_traceback
[MRG] scrapy.utils.misc.load_object should print full traceback
2014-10-02 00:33:27 -03:00
Pablo Hoffman
0c23b1342b Merge pull request #904 from scrapy/from_crawler_docs
[MRG] DOC document from_crawler method for item pipelines
2014-10-02 00:09:07 -03:00
Mikhail Korobov
6fcf9dce50 DOC document from_crawler method for item pipelines; add an example. 2014-09-25 03:13:51 +06:00
Mikhail Korobov
5086262913 don't hide original exception in scrapy.utils.misc.load_object 2014-09-24 13:27:14 +06:00
Mikhail Korobov
36eec8f413 dont_obey_robotstxt meta key; don't process requests to /robots.txt 2014-09-23 00:10:43 +06:00
Mikhail Korobov
fe6f3efe95 RobotsTxtMiddleware: remove unused attribute 2014-09-22 22:56:54 +06:00
Mikhail Korobov
d11c8595e6 drop support for CONCURRENT_REQUESTS_PER_SPIDER 2014-09-22 04:29:22 +06:00
Mikhail Korobov
bdbca1e2d7 DOC request queue memory usage 2014-09-21 07:30:44 +06:00
Mikhail Korobov
bc0f481a73 DOC bring back notes about multiple spiders per process because it is now documented how to do that 2014-09-21 07:12:01 +06:00
Mikhail Korobov
a122fdbfea Update leaks.rst: there is now only a single spider in a process. 2014-09-21 06:54:00 +06:00
Mikhail Korobov
7be3479c20 CookieJar cleanup 2014-09-21 06:37:32 +06:00
Mikhail Korobov
49645d4bf9 TST small cleanup of a cookie test 2014-09-21 05:31:34 +06:00
Mikhail Korobov
c543fe6e4c Merge pull request #878 from andrewshir/master
Fix bug for ".local" host name
2014-09-21 05:24:54 +06:00
Mikhail Korobov
e435b3e3a3 DOC simplify extension docs 2014-09-21 00:19:24 +06:00
John-Scott Atlakson
a312ebfb43 Update request-response.rst
Fixed minor typo
2014-09-14 22:06:31 +06:00
andrewshir
e583c030db Test for local domains (without dots) added 2014-09-14 14:24:16 +06:00
Mikael Åhlén
22da1783bd added a test-case for wrong quotechar 2014-09-13 03:47:40 +02:00
Mikael Åhlén
47b6dff9f1 Allow to specify the quotechar in CSVFeedSpider 2014-09-13 02:14:57 +02:00
Daniel Graña
5bcabfe9c9 SPIDER_MODULES can be set as a csv string 2014-09-10 23:25:57 -03:00
Daniel Graña
c05e99a4f4 oops, restore Pillow from precise test requirements 2014-09-10 12:21:08 -03:00
Daniel Graña
a823207f18 Stop logobserver only when set 2014-09-10 12:09:07 -03:00
Daniel Graña
ec93c0fdcc Add the tests changes for previous commit 2014-09-10 12:05:18 -03:00
Daniel Graña
ce180227fa Twisted 11.1.0 (precise) can not deal with generators in DeferredList
Also create a list of the crawlers before iterating them because crawlers are removed from the set once stopped
2014-09-10 12:04:14 -03:00
Daniel Graña
99971dc8a8 Do not pop the crawler from the managed list 2014-09-09 20:59:07 +00:00
Daniel Graña
774aa9ee56 Merge branch 'crawler-api' 2014-09-09 17:23:11 -03:00
Daniel Graña
8ddf0811a8 Correctly detect when all managed crawlers are done in CrawlerRunner 2014-09-09 17:21:39 -03:00
Daniel Graña
68954fa503 Merge pull request #879 from Curita/fix-874-issue
Fix #874 issue
2014-09-08 14:32:06 -03:00
Julia Medina
51532af69a Erase unneeded flag in CrawlerProcess.start 2014-09-07 13:03:34 -03:00
Julia Medina
d513b5a542 Run root logger in CrawlerProcess creation instead of in its start method 2014-09-07 13:02:39 -03:00
andrewshir
dfca7b3c80 Fix bug for ".local" host name
It's necessary to put the new list member in square brackets (i.e. create a new list) to merge the lists properly; otherwise the result list contains single-character elements instead of one string element.
2014-09-06 18:23:27 +06:00
Pablo Hoffman
acb0a61cf1 Merge pull request #871 from eltermann/master
Removed unused 'load=False' parameter from walk_modules()
2014-09-02 12:55:04 -03:00
eltermann
1dff1fbf75 Removed unused 'load=False' parameter from walk_modules() 2014-09-02 08:33:36 -03:00
Daniel Graña
5daa14770b Merge branch 'Curita-per-spider-settings' 2014-09-01 21:58:13 -03:00
Julia Medina
c2592b39fd Test verifying that CrawlerRunner populates spider class settings 2014-09-01 21:56:57 -03:00
Julia Medina
77bd26a66d Non mutable default in Spider.custom_settings 2014-09-01 21:56:57 -03:00
Julia Medina
16e62e9c9b Per-spider settings documentation 2014-09-01 21:56:57 -03:00
Julia Medina
9ef3972cfb Per-spider settings tests 2014-09-01 21:56:57 -03:00
Julia Medina
4932ec43a7 Per-spider settings implementation 2014-09-01 21:56:57 -03:00
Daniel Graña
ccde3317d7 Merge pull request #816 from Curita/api-cleanup
GSoC API cleanup
2014-09-01 21:55:36 -03:00
Pablo Hoffman
620dbe7116 Merge pull request #865 from yakxxx/master
SgmlLinkExtractor - fix for parsing <area> tag with Unicode present
2014-08-29 16:18:44 -03:00
Mikhail Korobov
8aa731c87f Merge pull request #866 from adamdonahue/patch-1
Fix typo
2014-08-29 06:59:47 +06:00
Adam Donahue
d92914d297 Fix typo 2014-08-28 20:30:50 -04:00
yakxxx
e4689556f0 SgmlLinkExtractor - fix for parsing <area> tag with Unicode present 2014-08-28 18:55:58 +02:00
Mikhail Korobov
774ab74ad2 Merge pull request #864 from younghz/master
Duplicate comma in request-response.rst
2014-08-28 18:52:51 +06:00
Uyounghz
d49766a6ac Duplicate comma in request-response.rst 2014-08-28 19:58:58 +08:00
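
Note on commit dfca7b3c80 (Fix bug for ".local" host name): the message describes a general Python pitfall, where extending a list with a bare string adds its characters one by one. The sketch below illustrates that behaviour only. The function names and sample values are hypothetical and are not taken from the Scrapy source, and the actual patched code may look different.

```python
# Illustrative only: merge_buggy and merge_fixed are hypothetical names,
# not functions from Scrapy.

def merge_buggy(domains, host):
    """Extend a copy of the list with a bare string (the pitfall)."""
    merged = list(domains)
    merged += host  # a str is iterable, so each character is appended
    return merged


def merge_fixed(domains, host):
    """Wrap the string in a list first so it is added as one element."""
    return list(domains) + [host]


print(merge_buggy(["example.com"], "local"))
# ['example.com', 'l', 'o', 'c', 'a', 'l']
print(merge_fixed(["example.com"], "local"))
# ['example.com', 'local']
```

Wrapping the host name in square brackets keeps it a single string element, which matches the behaviour the commit message asks for.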