Pablo Hoffman
8e49fed918
minor improvements to benchmarking doc
2013-05-16 13:23:13 -03:00
Pablo Hoffman
76087e336a
add scrapy bench command for benchmarking, with documentation
2013-05-16 13:15:25 -03:00
Pablo Hoffman
5c40741d65
added context manager for mock server, moved test spiders into a separate module (scrapy.tests.spiders)
2013-05-16 13:01:02 -03:00
Pablo Hoffman
214bcdf3be
Merge pull request #301 from tpeng/fix_telnet_doc
...
change manager in telnet variable to crawler
2013-05-08 07:00:52 -07:00
tpeng
edbe525d19
change manager in telnet variable to crawler
2013-05-08 11:58:15 +08:00
Pablo Hoffman
99f8d5733a
added mock tests for retries (more to come)
2013-05-06 20:04:15 -03:00
Pablo Hoffman
a7b7c89216
improved logging of downloader errors, when they contain tracebacks
2013-05-06 20:03:54 -03:00
Pablo Hoffman
db195ee4df
added more methods to mockserver
2013-05-06 20:03:17 -03:00
Pablo Hoffman
c2561ca160
use get_testenv() shortcut instead of get_pythonpath()
2013-05-06 17:35:09 -03:00
Pablo Hoffman
14e7121674
make sure the mockserver forked from tests uses the current installation of scrapy, not the system one
2013-05-06 17:27:59 -03:00
Pablo Hoffman
cf4d4bc094
added mock server test for DOWNLOAD_TIMEOUT
2013-05-06 14:47:24 -03:00
Daniel Graña
495acba223
agents require an instance of contextFactory
2013-05-06 12:56:13 -03:00
Daniel Graña
3e62ce35a0
cleanup http connection pool on engine stop
2013-05-06 12:56:13 -03:00
Daniel Graña
334ad71a2e
adapt for singleton removal
2013-05-06 12:56:13 -03:00
Daniel Graña
ba6545555f
empty bodies do not require a body producer
2013-05-06 12:56:13 -03:00
Daniel Graña
db232da068
close pool connections before finishing tests
2013-05-06 12:56:13 -03:00
Daniel Graña
b34606030d
remove duplicate context factory handling for non-ssl support in http1.0
2013-05-06 12:56:13 -03:00
Daniel Graña
0a26170086
move ssl context factory to its own module and implement a non-ssl version that warns about pyopenssl support
2013-05-06 12:52:58 -03:00
Daniel Graña
ab3407289c
enable persistent connections
2013-05-06 12:50:14 -03:00
Daniel Graña
e4fe7c63b0
add http connection pool and custom ssl context factory
2013-05-06 12:50:13 -03:00
Daniel Graña
a7a354f982
http11 cleanup
2013-05-06 12:45:27 -03:00
paul
ef03603869
Restore handling of HTTPS
2013-05-06 12:45:27 -03:00
paul
46341d5275
Renamed downloader to Http11DownloadHandler and some refactoring
...
Only for HTTP, not HTTPS
Test on expected body length instead of request method (HEAD case)
2013-05-06 12:45:27 -03:00
paul
4018d25a9b
Use twisted.web.client.Agent for download requests (use of HTTP/1.1)
...
Adds http11.HttpDownloadHandler in scrapy.core.downloader.handlers
2013-05-06 12:45:27 -03:00
Pablo Hoffman
66311db23e
mention crawlera in best practices, as a way to deal with bans
2013-05-04 18:20:23 -03:00
Pablo Hoffman
e1f4144391
removed obsolete (and broken) downloader method: is_idle()
2013-04-29 13:04:40 -03:00
Daniel Graña
2a1a4477d3
Merge pull request #297 from scrapy/downloader-gc
...
Add garbage collector to downloader
2013-04-29 07:38:27 -07:00
Pablo Hoffman
af5c13fa14
Add garbage collector to downloader
...
This fixes a couple of issues:
- reactor callLater leaks when using download delay (test was re-enabled)
- downloader slot leaking on broad crawls (slots were created but never removed)
2013-04-29 10:16:30 -03:00
Pablo Hoffman
36ee36000e
bind mockserver to 0.0.0.0, to listen on the whole 127.* range (useful for testing broad crawls)
2013-04-27 04:17:19 -03:00
Pablo Hoffman
9361c89573
remove scrapyd doc, as it was moved to its own repo
2013-04-27 04:15:42 -03:00
Daniel Graña
5ba2b60a4b
fix broken doctests
2013-04-25 11:56:56 -03:00
Daniel Graña
32b781a1b8
cookies: increase candidate list with dot prefixed domains
2013-04-24 16:22:40 -03:00
Daniel Graña
31bec08a39
rfc2965 is dead
...
- It is not enabled by default in python cookielib
- Mozilla rejected implementing it https://bugzilla.mozilla.org/show_bug.cgi?id=208985
- Netscape cookies still rule
- It was superseded by RFC6265, which formalizes the de facto protocol
2013-04-24 15:07:40 -03:00
Daniel Graña
973906153c
Merge pull request #77 from shane42/master
...
Cookie handling performance improvement
2013-04-24 10:59:36 -07:00
Pablo Hoffman
d02da2f31f
ported code to use queuelib
2013-04-23 17:48:09 -03:00
Daniel Graña
5531290d53
Merge pull request #292 from nramirezuy/item-unusedimport
...
deleted unused import
2013-04-19 10:53:20 -07:00
Nicolás Ramírez
bc592e9958
deleted unused import
2013-04-19 14:51:59 -03:00
Pablo Hoffman
7a1536f76e
Merge pull request #290 from nramirezuy/item-copy
...
added copy method to item
2013-04-19 09:27:44 -07:00
Nicolás Ramírez
6df274bba5
added copy method to item
2013-04-19 13:23:53 -03:00
Pablo Hoffman
98f89d1c3f
Merge pull request #291 from nramirezuy/cmd-parse-pipelines
...
parse pipelines test fixed
2013-04-19 09:22:38 -07:00
Pablo Hoffman
626331b865
test_crawl: make mock server print a line when ready, and wait for that line to start tests, instead of waiting for an arbitrary time
2013-04-19 13:06:30 -03:00
Nicolás Ramírez
e0c88f2d93
test fixed
2013-04-19 12:38:20 -03:00
Daniel Graña
ce177fa5bc
mock server is slow to come up on busy builders
2013-04-19 14:31:27 +00:00
Pablo Hoffman
74e9aecc8d
Merge pull request #288 from kmike/patch-1
...
Update faq.rst
2013-04-17 17:55:45 -07:00
Mikhail Korobov
b245d592aa
Update faq.rst
...
spider.DOWNLOAD_DELAY is deprecated
2013-04-18 02:42:15 +06:00
Pablo Hoffman
9feb65865c
Merge pull request #284 from nramirezuy/cmd-parse-pipelines
...
Command parse, --pipelines argument added
2013-04-09 06:26:24 -07:00
Pablo Hoffman
adf38a65e9
Merge pull request #283 from opyate/patch-1
...
Update overview.rst, Torrent referenced as TorrentItem in spider
2013-04-08 13:48:05 -07:00
Nicolás Ramírez
2b39527f72
pipelines argument added
2013-04-08 14:55:28 -03:00
Juan M Uys
4de3aa4932
Update overview.rst
2013-04-08 14:13:15 +02:00
Pablo Hoffman
96c2332e0e
fix inaccurate downloader middleware documentation. refs #280
2013-04-02 11:35:32 -03:00