Pablo Hoffman
66525e77c7
Improved support for scrapy-ctl -> scrapy migration by generating the scrapy.cfg file automatically if it doesn't exist
2010-08-19 17:57:34 -03:00
Pablo Hoffman
3d8151bb26
Added FAQ entry about response code 999
2010-08-19 16:51:51 -03:00
Pablo Hoffman
2ff5a83b7a
Added persistent execution queue (based on SQLite), and a new 'queue' command to control it. Closes #198
2010-08-19 02:55:52 -03:00
Pablo Hoffman
9740ad62a6
Added tests for runspider command
2010-08-19 02:30:15 -03:00
Pablo Hoffman
abd7a5e221
Fixed bug with runspider command that appeared after the introduction of new spider manager in r2121. Also factored out common code shared by new spider manager and runspider command, with tests included.
2010-08-19 01:58:10 -03:00
Pablo Hoffman
85f3a4a603
removed obsolete setting
2010-08-19 00:07:14 -03:00
Pablo Hoffman
94ead94bf6
Improved documentation of Scrapy command-line tool
...
--HG--
rename : docs/topics/cmdline.rst => docs/topics/commands.rst
2010-08-19 00:04:52 -03:00
Pablo Hoffman
34554da201
Deprecated scrapy-ctl.py command in favour of simpler "scrapy" command. Closes #199 . Also updated documentation accordingly and added convenient scrapy.bat script for running from Windows.
...
--HG--
rename : debian/scrapy-ctl.1 => debian/scrapy.1
rename : docs/topics/scrapy-ctl.rst => docs/topics/cmdline.rst
2010-08-18 19:48:32 -03:00
Daniel Grana
5522afca8d
pipeline process_item methods decorated with inlineCallbacks fail because of the backwards-compatible change to support inverted arguments
2010-08-18 13:05:50 -03:00
Pablo Hoffman
50f2236d04
removed hacky command_executed signal
2010-08-17 18:31:16 -03:00
Pablo Hoffman
a71521bfba
Default per-command settings are now specified in the default_settings attribute of the command object. Closes #201
2010-08-17 18:30:13 -03:00
Pablo Hoffman
76e19baa29
minor setting fix
2010-08-17 14:48:38 -03:00
Pablo Hoffman
ad3fd0afe8
fixed minor formatting issue with new feed exports doc
2010-08-17 14:37:59 -03:00
Pablo Hoffman
e741a807d2
Added new Feed exports extension with documentation and storage tests. Closes #197 .
...
Also deprecated File export pipeline (to be removed in Scrapy 0.11).
Still need to add tests for FeedExport main extension code.
2010-08-17 14:27:48 -03:00
Pablo Hoffman
d17695ee4b
Added "scrapy" command with project settings auto-discovery. Refs #199
2010-08-17 00:59:24 -03:00
Pablo Hoffman
b563e56363
fixed bug in url_is_from_spider() when no allowed_domains class attribute is present
2010-08-16 12:25:51 -03:00
Pablo Hoffman
a28cf29008
Simplified BaseSpider code by removing backwards compatibility code
2010-08-16 10:10:57 -03:00
Pablo Hoffman
3e3a66620b
Added support for returning deferreds from (some) signal handlers. Closes #193
2010-08-14 21:10:37 -03:00
Pablo Hoffman
5755bdcddc
Improved logging of spider errors, which were previously logged as confusing "Unhandled errors" - closes #196
2010-08-13 01:45:47 -03:00
Pablo Hoffman
1df2c17b78
updated old documentation references
2010-08-12 20:45:11 -03:00
Pablo Hoffman
43d47e5d9b
Some improvements to Item Pipeline ( closes #195 ):
...
* Made Item Pipeline Manager a subclass of scrapy.middleware.MiddlewareManager
* Added open_spider/close_spider methods with support for returning deferreds from them
* Inverted the process_item() arguments to be more friendly with deferred
callbacks (backwards compatibility kept through arguments introspection)
* Updated documentation with new methods and process_item() arguments change
2010-08-12 10:48:37 -03:00
Pablo Hoffman
2167bfb20d
added prepend_level argument to log._adapt_eventdict()
2010-08-12 10:26:50 -03:00
Pablo Hoffman
ae692c05b7
made MiddlewareManager an abstract class, and minor change to log message
2010-08-12 03:04:19 -03:00
Pablo Hoffman
192ec5e35b
Error logging improvements
2010-08-12 00:33:40 -03:00
Pablo Hoffman
6d21f2f23a
removed reference to old middleware
2010-08-10 18:27:58 -03:00
Pablo Hoffman
68ae1da0d0
removed unused import
2010-08-10 18:26:40 -03:00
Pablo Hoffman
6a0ef0e41c
removed unused import
2010-08-10 18:22:05 -03:00
Pablo Hoffman
0f342417b9
remove old unsupported item sampler middleware
2010-08-10 18:15:28 -03:00
Pablo Hoffman
2822db730f
added new MiddlewareManager class that will be used as base class for pipeline and middlewares
2010-08-10 18:03:47 -03:00
Pablo Hoffman
9d38a99aa8
updated missing doc reference from previous commit
2010-08-10 17:47:04 -03:00
Pablo Hoffman
dedd6b2256
added more information to deprecation notice
2010-08-10 17:42:05 -03:00
Pablo Hoffman
784722774b
moved scrapy.core.signals to scrapy.signals, keeping backwards compatibility
2010-08-10 17:40:53 -03:00
Pablo Hoffman
c359a34d7d
moved scrapy.core.exceptions to scrapy.exceptions, keeping backwards compatibility
...
--HG--
rename : scrapy/core/exceptions.py => scrapy/exceptions.py
2010-08-10 17:36:48 -03:00
Pablo Hoffman
355c615c2c
removed old unsupported SpiderProfiler extension
2010-08-10 17:23:22 -03:00
Pablo Hoffman
b1c0280616
removed scheduler middleware doc, as scheduler middleware will be removed soon
2010-08-10 16:59:49 -03:00
Pablo Hoffman
5aa8e63957
Removed 'sender' argument when sending signals, as we're not sending it consistently, and it's not being used by receivers either
2010-08-09 14:41:54 -03:00
Pablo Hoffman
3d121898e2
add signal handler name when logging errors
2010-08-09 13:32:44 -03:00
Pablo Hoffman
a08e62a25d
silence irrelevant (and confusing) errors generated in tests by signals left active after engine tests run - we should really rewrite engine tests asap
2010-08-09 13:24:22 -03:00
Pablo Hoffman
bb2e0de7da
fixed utils.signal tests broken in previous commit
2010-08-09 13:22:57 -03:00
Pablo Hoffman
8de0dc3647
Log full traceback of signal handler errors in send_catch_log() - closes #194 . Also made engine use send_catch_log for spider_idle signal
2010-08-09 12:05:03 -03:00
Pablo Hoffman
2aa84073ab
moved scrapy log observer logic into a separate function
2010-08-09 11:09:07 -03:00
Pablo Hoffman
5139843c5c
avoid noisy KeyError in enqueue_scrape, when closing spiders manually
2010-08-09 11:07:45 -03:00
Pablo Hoffman
67e42d1bfd
Moved scrapy/command/__init__.py to scrapy/command.py
...
--HG--
rename : scrapy/command/__init__.py => scrapy/command.py
2010-08-08 07:30:03 -03:00
Pablo Hoffman
c7d9f6e270
Added JSON item exporter with doc and unittests ( closes #192 ), and also:
...
* put all json exporters in scrapy.contrib.exporters and deprecated
scrapy.contrib.exporters.jsonlines to reduce module nesting
* use JSON exporter with EXPORT_FORMAT=json in file export pipeline
2010-08-07 15:52:59 -03:00
Pablo Hoffman
ba2369bfa1
changed variable names for clarity
2010-08-06 15:05:58 -03:00
Pablo Hoffman
35e6c8725b
Added handles_request() class method to BaseSpider - closes #191
2010-08-06 14:59:18 -03:00
Pablo Hoffman
4d66f4c6f8
Added support for logging twisted errors generated outside of Scrapy - refs #188
2010-08-05 20:46:54 -03:00
Pablo Hoffman
c82260f5ec
make runtests.sh more virtualenv-friendly
2010-08-05 13:31:19 -03:00
Pablo Hoffman
daa9506b34
Fixed bug with non-keepalive execution queues ( closes #190 )
2010-08-04 14:20:45 -03:00
Pablo Hoffman
49851d7f55
Automated merge with http://hg.scrapy.org/scrapy-0.9
2010-08-02 17:20:55 -03:00