mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 23:03:51 +00:00

1904 Commits

Author SHA1 Message Date
Pablo Hoffman
f3240748cb changed link to scheduler middleware doc, now in experimental 2009-09-11 12:03:23 -03:00
Ismael Carnales
3998a0cb58 added more scheduler middleware documentation, and moved it to experimental
--HG--
rename : docs/topics/scheduler-middleware.rst => docs/experimental/scheduler-middleware.rst
2009-09-11 11:58:53 -03:00
Pablo Hoffman
d242a20573 updated images pipeline doc 2009-09-11 11:47:12 -03:00
Daniel Grana
40d38b18d8 Automated merge with ssh://hg.scrapy.org/scrapy 2009-09-11 11:04:47 -03:00
Daniel Grana
96bc6780d3 imagespipeline: change scraped_url to url 2009-09-11 11:03:53 -03:00
Pablo Hoffman
0174bee4bc simplified implementation of scrapy.fetcher 2009-09-10 19:27:47 -03:00
Pablo Hoffman
f1bb8dc2a3 first cleanup of spider manager api
- removed asdict() and reload() methods
- added list() method
- removed default spider
2009-09-10 19:06:46 -03:00
Pablo Hoffman
f85813cd94 added FAQ entry about scrapy recipes and community spiders 2009-09-10 18:32:50 -03:00
Daniel Grana
734464825b allow to override httpclientfactory 2009-09-10 16:28:08 -03:00
Pablo Hoffman
269724a2b7 added Debugger extension, removed StackTraceDump from extensions available by default 2009-09-08 22:32:17 -03:00
Pablo Hoffman
2974c2c4b5 some additional checks on using unicode url/body in Request/Response objects 2009-09-07 15:20:41 -03:00
Pablo Hoffman
7a88c0d8e5 shell: fixed bug when typing exit() in python console - fixes #103 2009-09-07 14:57:50 -03:00
Ismael Carnales
4ddfa9a2a3 styled downloaded middleware doc 2009-09-07 12:18:57 -03:00
Ismael Carnales
e3df11e5bb added module directive to spidermw documentation 2009-09-07 12:03:24 -03:00
Ismael Carnales
30c2ad3f0c added urllength spider middleware test 2009-09-07 11:14:47 -03:00
Ismael Carnales
c4ad2bea5d added urlfilter spidermw test 2009-09-07 11:14:47 -03:00
Ismael Carnales
43bd00dea2 added referer spider middleware test 2009-09-07 11:14:46 -03:00
Ismael Carnales
1c700749ae added offside spider middleware test 2009-09-07 11:14:45 -03:00
Ismael Carnales
083635ebaf added depth spider middleware test 2009-09-07 11:14:43 -03:00
Ismael Carnales
f0b5892aa5 fixed stats downloadermw test 2009-09-07 11:14:38 -03:00
Pablo Hoffman
82ca5e26f5 renamed test_xpath.py to test_selector.py
--HG--
rename : scrapy/tests/test_xpath.py => scrapy/tests/test_selector.py
2009-09-07 09:38:24 -03:00
Pablo Hoffman
4023914b10 added support for instantiating TextResponse (or any subclass) with unicode urls, improved organization of request/response unittests 2009-09-07 09:37:46 -03:00
Pablo Hoffman
e2bd1be995 better aws code arrangement
--HG--
rename : scrapy/tests/test_aws.py => scrapy/tests/test_utils_aws.py
2009-09-04 18:07:51 -03:00
Pablo Hoffman
827aa19c6e removed obsolete scrapy.utils.db module 2009-09-04 17:38:14 -03:00
Pablo Hoffman
7466150286 removed some more obsolete middlewares 2009-09-04 17:32:34 -03:00
Pablo Hoffman
861a803cc3 removed obsolete RestrictMiddleware 2009-09-04 17:22:56 -03:00
Pablo Hoffman
0631102153 removed backwards compatibility for old errorpages downloader middleware 2009-09-04 17:19:03 -03:00
Ismael Carnales
043e7355f7 added some missing spidermw tests 2009-09-04 14:11:56 -03:00
Ismael Carnales
7e2587169b added missing middleware docs 2009-09-04 12:39:02 -03:00
Ismael Carnales
6d127d7fcf added some missing middlewares tests 2009-09-04 12:29:43 -03:00
Pablo Hoffman
aefb94063a more updates to spider middleware doc 2009-09-04 13:46:04 -03:00
Pablo Hoffman
d04640be5c some improvements to spider middleware doc 2009-09-04 13:29:16 -03:00
Pablo Hoffman
96bb223c13 removed (pretty useless) DebugMiddleware 2009-09-04 12:59:58 -03:00
Pablo Hoffman
86de5180fc fixed bug in robots middleware reported by fencer in #101 2009-09-04 12:36:29 -03:00
Pablo Hoffman
dad05957f3 added comment downloader backout policy 2009-09-04 01:16:58 -03:00
Daniel Grana
2ae11e9220 measure downloader backout based on active requests that includes those in downloadermw plus queue 2009-09-03 16:58:36 -03:00
Daniel Grana
8b4304d5eb change csv exporter to check flag immediately instead of calling another function 2009-09-03 16:26:05 -03:00
Daniel Grana
5ff8ed8686 media_pipeline: let failures reach item_completed
--HG--
extra : rebase_source : ec574d95957646498f021c25953ac1405d1c748c
2009-09-03 16:10:45 -03:00
Pablo Hoffman
8a715701ec fixed another doc typo 2009-09-03 14:31:00 -03:00
Pablo Hoffman
760cf452c5 automatic merge 2009-09-03 14:30:17 -03:00
Pablo Hoffman
0d493615ab some code rearrangement without functionality changes 2009-09-03 14:29:20 -03:00
Daniel Grana
0e7b2a6da5 write header line by default when using csv exporter
--HG--
extra : rebase_source : 2d2d7153dde5e3f77e682e16d2e4408f732f234e
2009-09-03 13:58:39 -03:00
Pablo Hoffman
c33d6bbdbd avoid shutting down the reactor from two places, for now 2009-09-03 12:41:04 -03:00
Ismael Carnales
3c1bb7bc40 fixed typo in djangoitems doc (thanks anibal) 2009-09-03 11:23:25 -03:00
Pablo Hoffman
be76490002 avoid stopping reactor if it's already in shutdown stage (where reactor.running is still True) 2009-09-03 10:25:55 -03:00
Pablo Hoffman
0643010f0c engine: stopping reactor when there's nothing left to do 2009-09-03 09:47:59 -03:00
Pablo Hoffman
a7f5c6f878 Some enhancements to Scrapy core:
- added graceful shutdown (with one ^C) and forced shutdown (two ^C)
- added optional loglevel to IgnoreRequest exception
- calling Spider Manager close_domain() function from engine to avoid race
  conditions caused by signal dispatch order
- moved twisted reactor controlling logic outside the engine and into the
  scrapy manager
- made log.err function respect the current log level filter
2009-09-03 08:27:48 -03:00
Pablo Hoffman
51eb641fc7 fix bug in scrapy shell which was hiding the objects fetch/view/shelp when started without a url as argument 2009-09-02 20:53:55 -03:00
Pablo Hoffman
37929f42b3 spider manager: added protection to avoid reloading non-spider modules 2009-09-02 20:43:05 -03:00
Pablo Hoffman
11f296b753 removed hack for switching standard descriptors in favor of using LOG_STDOUT=False by default 2009-09-02 16:22:45 -03:00