mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 02:43:41 +00:00

3630 Commits

Author SHA1 Message Date
Pablo Hoffman
8e49fed918 minor improvements to benchmarking doc 2013-05-16 13:23:13 -03:00
Pablo Hoffman
76087e336a add scrapy bench command for benchmarking, with documentation 2013-05-16 13:15:25 -03:00
Pablo Hoffman
5c40741d65 added context manager for mock server, moved test spiders into a separate module (scrapy.tests.spiders) 2013-05-16 13:01:02 -03:00
Pablo Hoffman
214bcdf3be Merge pull request #301 from tpeng/fix_telnet_doc
change manager in telnet variable to crawler
2013-05-08 07:00:52 -07:00
tpeng
edbe525d19 change manager in telnet variable to crawler 2013-05-08 11:58:15 +08:00
Pablo Hoffman
99f8d5733a added mock tests for retries (more to come) 2013-05-06 20:04:15 -03:00
Pablo Hoffman
a7b7c89216 improved logging of downloader errors, when they contain tracebacks 2013-05-06 20:03:54 -03:00
Pablo Hoffman
db195ee4df added more methods to mockserver 2013-05-06 20:03:17 -03:00
Pablo Hoffman
c2561ca160 use get_testenv() shortcut instead of get_pythonpath() 2013-05-06 17:35:09 -03:00
Pablo Hoffman
14e7121674 make sure the mockserver forked from tests uses the current installation of scrapy, not the system one 2013-05-06 17:27:59 -03:00
Pablo Hoffman
cf4d4bc094 added mock server test for DOWNLOAD_TIMEOUT 2013-05-06 14:47:24 -03:00
Daniel Graña
495acba223 agents require an instance of contextFactory 2013-05-06 12:56:13 -03:00
Daniel Graña
3e62ce35a0 cleanup http connection pool on engine stop 2013-05-06 12:56:13 -03:00
Daniel Graña
334ad71a2e adapt for singletons removals 2013-05-06 12:56:13 -03:00
Daniel Graña
ba6545555f empty bodies do not require a body producer 2013-05-06 12:56:13 -03:00
Daniel Graña
db232da068 close pool connections before finishing tests 2013-05-06 12:56:13 -03:00
Daniel Graña
b34606030d remove duplicate context factory handling for non-ssl support in http1.0 2013-05-06 12:56:13 -03:00
Daniel Graña
0a26170086 move ssl context factory to its own module and implement a non-ssl version that warns about pyopenssl support 2013-05-06 12:52:58 -03:00
Daniel Graña
ab3407289c enable persistent connections 2013-05-06 12:50:14 -03:00
Daniel Graña
e4fe7c63b0 add http connection pool and custom ssl context factory 2013-05-06 12:50:13 -03:00
Daniel Graña
a7a354f982 http11 cleanup 2013-05-06 12:45:27 -03:00
paul
ef03603869 Restore handling of HTTPS 2013-05-06 12:45:27 -03:00
paul
46341d5275 Renamed downloader to Http11DownloadHandler and some refactoring
Only for HTTP, not HTTPS
Test on expected body length instead of request method (HEAD case)
2013-05-06 12:45:27 -03:00
paul
4018d25a9b Use twisted.web.client.Agent for download requests (use of HTTP/1.1)
Adds http11.HttpDownloadHandler in scrapy.core.downloader.handlers
2013-05-06 12:45:27 -03:00
Pablo Hoffman
66311db23e mention crawlera in best practices, as a way to deal with bans 2013-05-04 18:20:23 -03:00
Pablo Hoffman
e1f4144391 removed obsolete (and broken) downloader method: is_idle() 2013-04-29 13:04:40 -03:00
Daniel Graña
2a1a4477d3 Merge pull request #297 from scrapy/downloader-gc
Add garbage collector to downloader
2013-04-29 07:38:27 -07:00
Pablo Hoffman
af5c13fa14 Add garbage collector to downloader
This fixes a couple of issues:
- reactor callLater leaks when using download delay (test was
  re-enabled)
- downloader slot leaking on broad crawls (slots were created but never
  removed)
2013-04-29 10:16:30 -03:00
Pablo Hoffman
36ee36000e bind mockserver to 0.0.0.0, to listen on the whole 127.* range (useful for testing broad crawls) 2013-04-27 04:17:19 -03:00
Pablo Hoffman
9361c89573 remove scrapyd doc, as it was moved to its own repo 2013-04-27 04:15:42 -03:00
Daniel Graña
5ba2b60a4b fix broken doctests 2013-04-25 11:56:56 -03:00
Daniel Graña
32b781a1b8 cookies: increase candidate list with dot prefixed domains 2013-04-24 16:22:40 -03:00
Daniel Graña
31bec08a39 rfc2965 is dead
- It is not enabled by default in python cookielib
- Mozilla rejected implementing it https://bugzilla.mozilla.org/show_bug.cgi?id=208985
- Netscape cookies still rule
- It was superseded by RFC6265, which is the de facto protocol, formalized
2013-04-24 15:07:40 -03:00
Daniel Graña
973906153c Merge pull request #77 from shane42/master
Cookie handling performance improvement
2013-04-24 10:59:36 -07:00
Pablo Hoffman
d02da2f31f ported code to use queuelib 2013-04-23 17:48:09 -03:00
Daniel Graña
5531290d53 Merge pull request #292 from nramirezuy/item-unusedimport
deleted unused import
2013-04-19 10:53:20 -07:00
Nicolás Ramírez
bc592e9958 deleted unused import 2013-04-19 14:51:59 -03:00
Pablo Hoffman
7a1536f76e Merge pull request #290 from nramirezuy/item-copy
added copy method to item
2013-04-19 09:27:44 -07:00
Nicolás Ramírez
6df274bba5 added copy method to item 2013-04-19 13:23:53 -03:00
Pablo Hoffman
98f89d1c3f Merge pull request #291 from nramirezuy/cmd-parse-pipelines
parse pipelines test fixed
2013-04-19 09:22:38 -07:00
Pablo Hoffman
626331b865 test_crawl: make mock server print a line when ready, and wait for that line to start tests, instead of waiting for an arbitrary time 2013-04-19 13:06:30 -03:00
Nicolás Ramírez
e0c88f2d93 test fixed 2013-04-19 12:38:20 -03:00
Daniel Graña
ce177fa5bc mock server is slow to come up on busy builders 2013-04-19 14:31:27 +00:00
Pablo Hoffman
74e9aecc8d Merge pull request #288 from kmike/patch-1
Update faq.rst
2013-04-17 17:55:45 -07:00
Mikhail Korobov
b245d592aa Update faq.rst
spider.DOWNLOAD_DELAY is deprecated
2013-04-18 02:42:15 +06:00
Pablo Hoffman
9feb65865c Merge pull request #284 from nramirezuy/cmd-parse-pipelines
Command parse, --pipelines argument added
2013-04-09 06:26:24 -07:00
Pablo Hoffman
adf38a65e9 Merge pull request #283 from opyate/patch-1
Update overview.rst, Torrent referenced as TorrentItem in spider
2013-04-08 13:48:05 -07:00
Nicolás Ramírez
2b39527f72 pipelines argument added 2013-04-08 14:55:28 -03:00
Juan M Uys
4de3aa4932 Update overview.rst 2013-04-08 14:13:15 +02:00
Pablo Hoffman
96c2332e0e fix inaccurate downloader middleware documentation. refs #280 2013-04-02 11:35:32 -03:00