Pablo Hoffman
f3240748cb
changed link to scheduler middleware doc, now in experimental
2009-09-11 12:03:23 -03:00
Ismael Carnales
3998a0cb58
added more scheduler middleware documentation, and moved it to experimental
...
--HG--
rename : docs/topics/scheduler-middleware.rst => docs/experimental/scheduler-middleware.rst
2009-09-11 11:58:53 -03:00
Pablo Hoffman
d242a20573
updated images pipeline doc
2009-09-11 11:47:12 -03:00
Daniel Grana
40d38b18d8
Automated merge with ssh://hg.scrapy.org/scrapy
2009-09-11 11:04:47 -03:00
Daniel Grana
96bc6780d3
imagespipeline: change scraped_url to url
2009-09-11 11:03:53 -03:00
Pablo Hoffman
0174bee4bc
simplified implementation of scrapy.fetcher
2009-09-10 19:27:47 -03:00
Pablo Hoffman
f1bb8dc2a3
first cleanup of spider manager api
...
- removed asdict() and reload() methods
- added list() method
- removed default spider
2009-09-10 19:06:46 -03:00
Pablo Hoffman
f85813cd94
added FAQ entry about scrapy recipes and community spiders
2009-09-10 18:32:50 -03:00
Daniel Grana
734464825b
allow overriding httpclientfactory
2009-09-10 16:28:08 -03:00
Pablo Hoffman
269724a2b7
added Debugger extension, removed StackTraceDump from extensions available by default
2009-09-08 22:32:17 -03:00
Pablo Hoffman
2974c2c4b5
some additional checks on using unicode url/body in Request/Response objects
2009-09-07 15:20:41 -03:00
Pablo Hoffman
7a88c0d8e5
shell: fixed bug when typing exit() in python console - fixes #103
2009-09-07 14:57:50 -03:00
Ismael Carnales
4ddfa9a2a3
styled downloader middleware doc
2009-09-07 12:18:57 -03:00
Ismael Carnales
e3df11e5bb
added module directive to spidermw documentation
2009-09-07 12:03:24 -03:00
Ismael Carnales
30c2ad3f0c
added urllength spider middleware test
2009-09-07 11:14:47 -03:00
Ismael Carnales
c4ad2bea5d
added urlfilter spidermw test
2009-09-07 11:14:47 -03:00
Ismael Carnales
43bd00dea2
added referer spider middleware test
2009-09-07 11:14:46 -03:00
Ismael Carnales
1c700749ae
added offside spider middleware test
2009-09-07 11:14:45 -03:00
Ismael Carnales
083635ebaf
added depth spider middleware test
2009-09-07 11:14:43 -03:00
Ismael Carnales
f0b5892aa5
fixed stats downloadermw test
2009-09-07 11:14:38 -03:00
Pablo Hoffman
82ca5e26f5
renamed test_xpath.py to test_selector.py
...
--HG--
rename : scrapy/tests/test_xpath.py => scrapy/tests/test_selector.py
2009-09-07 09:38:24 -03:00
Pablo Hoffman
4023914b10
added support for instantiating TextResponse (or any subclass) with unicode urls, improved organization of request/response unittests
2009-09-07 09:37:46 -03:00
Pablo Hoffman
e2bd1be995
better aws code arrangement
...
--HG--
rename : scrapy/tests/test_aws.py => scrapy/tests/test_utils_aws.py
2009-09-04 18:07:51 -03:00
Pablo Hoffman
827aa19c6e
removed obsolete scrapy.utils.db module
2009-09-04 17:38:14 -03:00
Pablo Hoffman
7466150286
removed some more obsolete middlewares
2009-09-04 17:32:34 -03:00
Pablo Hoffman
861a803cc3
removed obsolete RestrictMiddleware
2009-09-04 17:22:56 -03:00
Pablo Hoffman
0631102153
removed backwards compatibility for old errorpages downloader middleware
2009-09-04 17:19:03 -03:00
Ismael Carnales
043e7355f7
added some missing spidermw tests
2009-09-04 14:11:56 -03:00
Ismael Carnales
7e2587169b
added missing middleware docs
2009-09-04 12:39:02 -03:00
Ismael Carnales
6d127d7fcf
added some missing middlewares tests
2009-09-04 12:29:43 -03:00
Pablo Hoffman
aefb94063a
more updates to spider middleware doc
2009-09-04 13:46:04 -03:00
Pablo Hoffman
d04640be5c
some improvements to spider middleware doc
2009-09-04 13:29:16 -03:00
Pablo Hoffman
96bb223c13
removed (pretty useless) DebugMiddleware
2009-09-04 12:59:58 -03:00
Pablo Hoffman
86de5180fc
fixed bug in robots middleware reported by fencer in #101
2009-09-04 12:36:29 -03:00
Pablo Hoffman
dad05957f3
added comment on downloader backout policy
2009-09-04 01:16:58 -03:00
Daniel Grana
2ae11e9220
measure downloader backout based on active requests, which include those in downloadermw plus the queue
2009-09-03 16:58:36 -03:00
Daniel Grana
8b4304d5eb
change csv exporter to check flag immediately instead of calling another function
2009-09-03 16:26:05 -03:00
Daniel Grana
5ff8ed8686
media_pipeline: let failures reach item_completed
...
--HG--
extra : rebase_source : ec574d95957646498f021c25953ac1405d1c748c
2009-09-03 16:10:45 -03:00
Pablo Hoffman
8a715701ec
fixed another doc typo
2009-09-03 14:31:00 -03:00
Pablo Hoffman
760cf452c5
automatic merge
2009-09-03 14:30:17 -03:00
Pablo Hoffman
0d493615ab
some code rearrangement without functionality changes
2009-09-03 14:29:20 -03:00
Daniel Grana
0e7b2a6da5
write header line by default when using csv exporter
...
--HG--
extra : rebase_source : 2d2d7153dde5e3f77e682e16d2e4408f732f234e
2009-09-03 13:58:39 -03:00
Pablo Hoffman
c33d6bbdbd
avoid shutting down the reactor from two places, for now
2009-09-03 12:41:04 -03:00
Ismael Carnales
3c1bb7bc40
fixed typo in djangoitems doc (thanks anibal)
2009-09-03 11:23:25 -03:00
Pablo Hoffman
be76490002
avoid stopping reactor if it's already in shutdown stage (where reactor.running is still True)
2009-09-03 10:25:55 -03:00
Pablo Hoffman
0643010f0c
engine: stopping reactor when there's nothing left to do
2009-09-03 09:47:59 -03:00
Pablo Hoffman
a7f5c6f878
Some enhancements to Scrapy core:
...
- added graceful shutdown (with one ^C) and forced shutdown (two ^C)
- added optional loglevel to IgnoreRequest exception
- calling Spider Manager close_domain() function from engine to avoid race conditions caused by signal dispatch order
- moved twisted reactor controlling logic outside the engine and into the scrapy manager
- made log.err function respect the current log level filter
2009-09-03 08:27:48 -03:00
Pablo Hoffman
51eb641fc7
fix bug in scrapy shell which was hiding the objects fetch/view/shelp when started without a url as argument
2009-09-02 20:53:55 -03:00
Pablo Hoffman
37929f42b3
spider manager: added protection to avoid reloading non-spider modules
2009-09-02 20:43:05 -03:00
Pablo Hoffman
11f296b753
removed hack for switching standard descriptors in favor of using LOG_STDOUT=False by default
2009-09-02 16:22:45 -03:00