mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 15:07:05 +00:00

2327 Commits

Author SHA1 Message Date
Pablo Hoffman
66525e77c7 Improved support for scrapy-ctl -> scrapy migration by generating the scrapy.cfg file automatically if it doesn't exist 2010-08-19 17:57:34 -03:00
Pablo Hoffman
3d8151bb26 Added FAQ entry about response code 999 2010-08-19 16:51:51 -03:00
Pablo Hoffman
2ff5a83b7a Added persistent execution queue (based on SQLite), and a new 'queue' command to control it. Closes #198 2010-08-19 02:55:52 -03:00
Pablo Hoffman
9740ad62a6 Added tests for runspider command 2010-08-19 02:30:15 -03:00
Pablo Hoffman
abd7a5e221 Fixed bug with runspider command that appeared after the introduction of new spider manager in r2121. Also factored out common code shared by new spider manager and runspider command, with tests included. 2010-08-19 01:58:10 -03:00
Pablo Hoffman
85f3a4a603 removed obsolete setting 2010-08-19 00:07:14 -03:00
Pablo Hoffman
94ead94bf6 Improved documentation of Scrapy command-line tool
--HG--
rename : docs/topics/cmdline.rst => docs/topics/commands.rst
2010-08-19 00:04:52 -03:00
Pablo Hoffman
34554da201 Deprecated scrapy-ctl.py command in favour of simpler "scrapy" command. Closes #199. Also updated documentation accordingly and added convenient scrapy.bat script for running from Windows.
--HG--
rename : debian/scrapy-ctl.1 => debian/scrapy.1
rename : docs/topics/scrapy-ctl.rst => docs/topics/cmdline.rst
2010-08-18 19:48:32 -03:00
Daniel Grana
5522afca8d pipeline process_item methods decorated with inlineCallbacks fail because of backwards compatible change to support inverted arguments 2010-08-18 13:05:50 -03:00
Pablo Hoffman
50f2236d04 removed hacky command_executed signal 2010-08-17 18:31:16 -03:00
Pablo Hoffman
a71521bfba Default per-command settings are now specified in the default_settings attribute of the command object. Closes #201 2010-08-17 18:30:13 -03:00
Pablo Hoffman
76e19baa29 minor setting fix 2010-08-17 14:48:38 -03:00
Pablo Hoffman
ad3fd0afe8 fixed minor formatting issue with new feed exports doc 2010-08-17 14:37:59 -03:00
Pablo Hoffman
e741a807d2 Added new Feed exports extension with documentation and storage tests. Closes #197.
Also deprecated File export pipeline (to be removed in Scrapy 0.11).

Still need to add tests for FeedExport main extension code.
2010-08-17 14:27:48 -03:00
Pablo Hoffman
d17695ee4b Added "scrapy" command with project settings auto-discovery. Refs #199 2010-08-17 00:59:24 -03:00
Pablo Hoffman
b563e56363 fixed bug in url_is_from_spider() when no allowed_domains class attribute is present 2010-08-16 12:25:51 -03:00
Pablo Hoffman
a28cf29008 Simplified BaseSpider code by removing backwards compatibility code 2010-08-16 10:10:57 -03:00
Pablo Hoffman
3e3a66620b Added support for returning deferreds from (some) signal handlers. Closes #193 2010-08-14 21:10:37 -03:00
Pablo Hoffman
5755bdcddc Improve spider errors logging which were previously logged as confusing "Unhandled errors" - closes #196 2010-08-13 01:45:47 -03:00
Pablo Hoffman
1df2c17b78 updated old documentation references 2010-08-12 20:45:11 -03:00
Pablo Hoffman
43d47e5d9b Some improvements to Item Pipeline (closes #195):
* Made Item Pipeline Manager a subclass of scrapy.middleware.MiddlewareManager
* Added open_spider/close_spider methods with support for returning deferreds from them
* Inverted the process_item() arguments to be more friendly with deferred
  callbacks (backwards compatibility kept through arguments introspection)
* Updated documentation with new methods and process_item() arguments change
2010-08-12 10:48:37 -03:00
Pablo Hoffman
2167bfb20d added prepend_level argument to log._adapt_eventdict() 2010-08-12 10:26:50 -03:00
Pablo Hoffman
ae692c05b7 made MiddlewareManager an abstract class, and minor change to log message 2010-08-12 03:04:19 -03:00
Pablo Hoffman
192ec5e35b Error logging improvements 2010-08-12 00:33:40 -03:00
Pablo Hoffman
6d21f2f23a removed reference to old middleware 2010-08-10 18:27:58 -03:00
Pablo Hoffman
68ae1da0d0 removed unused import 2010-08-10 18:26:40 -03:00
Pablo Hoffman
6a0ef0e41c removed unused import 2010-08-10 18:22:05 -03:00
Pablo Hoffman
0f342417b9 remove old unsupported item sampler middleware 2010-08-10 18:15:28 -03:00
Pablo Hoffman
2822db730f added new MiddlewareManager class that will be used as base class for pipeline and middlewares 2010-08-10 18:03:47 -03:00
Pablo Hoffman
9d38a99aa8 updated missing doc reference from previous commit 2010-08-10 17:47:04 -03:00
Pablo Hoffman
dedd6b2256 added more information to deprecation notice 2010-08-10 17:42:05 -03:00
Pablo Hoffman
784722774b moved scrapy.core.signals to scrapy.signals, keeping backwards compatibility 2010-08-10 17:40:53 -03:00
Pablo Hoffman
c359a34d7d moved scrapy.core.exceptions to scrapy.exceptions, keeping backwards compatibility
--HG--
rename : scrapy/core/exceptions.py => scrapy/exceptions.py
2010-08-10 17:36:48 -03:00
Pablo Hoffman
355c615c2c removed old unsupported SpiderProfiler extension 2010-08-10 17:23:22 -03:00
Pablo Hoffman
b1c0280616 removed scheduler middleware doc, as scheduler middleware will be removed soon 2010-08-10 16:59:49 -03:00
Pablo Hoffman
5aa8e63957 Removed 'sender' argument when sending signals, as we're not sending it consistently, and it's not being used by receivers either 2010-08-09 14:41:54 -03:00
Pablo Hoffman
3d121898e2 add signal handler name when logging errors 2010-08-09 13:32:44 -03:00
Pablo Hoffman
a08e62a25d silence irrelevant (and confusing) errors generated in tests by signals left active after engine tests run - we should really rewrite engine tests asap 2010-08-09 13:24:22 -03:00
Pablo Hoffman
bb2e0de7da fixed utils.signal tests broken in previous commit 2010-08-09 13:22:57 -03:00
Pablo Hoffman
8de0dc3647 Log full traceback of signal handler errors in send_catch_log() - closes #194. Also made engine use send_catch_log for spider_idle signal 2010-08-09 12:05:03 -03:00
Pablo Hoffman
2aa84073ab moved scrapy log observer logic into a separate function 2010-08-09 11:09:07 -03:00
Pablo Hoffman
5139843c5c avoid noisy KeyError in enqueue_scrape, when closing spiders manually 2010-08-09 11:07:45 -03:00
Pablo Hoffman
67e42d1bfd Moved scrapy/command/__init__.py to scrapy/command.py
--HG--
rename : scrapy/command/__init__.py => scrapy/command.py
2010-08-08 07:30:03 -03:00
Pablo Hoffman
c7d9f6e270 Added JSON item exporter with doc and unittests (closes #192), and also:
* put all json exporters in scrapy.contrib.exporters and deprecated
  scrapy.contrib.exporters.jsonlines to reduce module nesting
* use JSON exporter with EXPORT_FORMAT=json in file export pipeline
2010-08-07 15:52:59 -03:00
Pablo Hoffman
ba2369bfa1 changed variable names for clarity 2010-08-06 15:05:58 -03:00
Pablo Hoffman
35e6c8725b Added handles_request() class method to BaseSpider - closes #191 2010-08-06 14:59:18 -03:00
Pablo Hoffman
4d66f4c6f8 Added support for logging twisted errors generated outside of Scrapy - refs #188 2010-08-05 20:46:54 -03:00
Pablo Hoffman
c82260f5ec make runtests.sh more virtualenv-friendly 2010-08-05 13:31:19 -03:00
Pablo Hoffman
daa9506b34 Fixed bug with non-keepalive execution queues (closes #190) 2010-08-04 14:20:45 -03:00
Pablo Hoffman
49851d7f55 Automated merge with http://hg.scrapy.org/scrapy-0.9 2010-08-02 17:20:55 -03:00