mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 20:23:56 +00:00

2169 Commits

Author SHA1 Message Date
Daniel Grana
5522afca8d pipeline process_item methods decorated with inlineCallbacks fail because of the backwards-compatible change to support inverted arguments 2010-08-18 13:05:50 -03:00
Pablo Hoffman
50f2236d04 removed hacky command_executed signal 2010-08-17 18:31:16 -03:00
Pablo Hoffman
a71521bfba Default per-command settings are now specified in the default_settings attribute of the command object. Closes #201 2010-08-17 18:30:13 -03:00
Pablo Hoffman
76e19baa29 minor setting fix 2010-08-17 14:48:38 -03:00
Pablo Hoffman
ad3fd0afe8 fixed minor formatting issue with new feed exports doc 2010-08-17 14:37:59 -03:00
Pablo Hoffman
e741a807d2 Added new Feed exports extension with documentation and storage tests. Closes #197.
Also deprecated File export pipeline (to be removed in Scrapy 0.11).

Still need to add tests for FeedExport main extension code.
2010-08-17 14:27:48 -03:00
Pablo Hoffman
d17695ee4b Added "scrapy" command with project settings auto-discovery. Refs #199 2010-08-17 00:59:24 -03:00
Pablo Hoffman
b563e56363 fixed bug in url_is_from_spider() when no allowed_domains class attribute is present 2010-08-16 12:25:51 -03:00
Pablo Hoffman
a28cf29008 Simplified BaseSpider code by removing backwards compatibility code 2010-08-16 10:10:57 -03:00
Pablo Hoffman
3e3a66620b Added support for returning deferreds from (some) signal handlers. Closes #193 2010-08-14 21:10:37 -03:00
Pablo Hoffman
5755bdcddc Improved logging of spider errors, which were previously logged as confusing "Unhandled errors" - closes #196 2010-08-13 01:45:47 -03:00
Pablo Hoffman
1df2c17b78 updated old documentation references 2010-08-12 20:45:11 -03:00
Pablo Hoffman
43d47e5d9b Some improvements to Item Pipeline (closes #195):
* Made Item Pipeline Manager a subclass of scrapy.middleware.MiddlewareManager
* Added open_spider/close_spider methods with support for returning deferreds from them
* Inverted the process_item() arguments to be more friendly with deferred
  callbacks (backwards compatibility kept through arguments introspection)
* Updated documentation with new methods and process_item() arguments change
2010-08-12 10:48:37 -03:00
Pablo Hoffman
2167bfb20d added prepend_level argument to log._adapt_eventdict() 2010-08-12 10:26:50 -03:00
Pablo Hoffman
ae692c05b7 made MiddlewareManager an abstract class, and minor change to log message 2010-08-12 03:04:19 -03:00
Pablo Hoffman
192ec5e35b Error logging improvements 2010-08-12 00:33:40 -03:00
Pablo Hoffman
6d21f2f23a removed reference to old middleware 2010-08-10 18:27:58 -03:00
Pablo Hoffman
68ae1da0d0 removed unused import 2010-08-10 18:26:40 -03:00
Pablo Hoffman
6a0ef0e41c removed unused import 2010-08-10 18:22:05 -03:00
Pablo Hoffman
0f342417b9 remove old unsupported item sampler middleware 2010-08-10 18:15:28 -03:00
Pablo Hoffman
2822db730f added new MiddlewareManager class that will be used as base class for pipeline and middlewares 2010-08-10 18:03:47 -03:00
Pablo Hoffman
9d38a99aa8 updated missing doc reference from previous commit 2010-08-10 17:47:04 -03:00
Pablo Hoffman
dedd6b2256 added more information to deprecation notice 2010-08-10 17:42:05 -03:00
Pablo Hoffman
784722774b moved scrapy.core.signals to scrapy.signals, keeping backwards compatibility 2010-08-10 17:40:53 -03:00
Pablo Hoffman
c359a34d7d moved scrapy.core.exceptions to scrapy.exceptions, keeping backwards compatibility
--HG--
rename : scrapy/core/exceptions.py => scrapy/exceptions.py
2010-08-10 17:36:48 -03:00
Pablo Hoffman
355c615c2c removed old unsupported SpiderProfiler extension 2010-08-10 17:23:22 -03:00
Pablo Hoffman
b1c0280616 removed scheduler middleware doc, as scheduler middleware will be removed soon 2010-08-10 16:59:49 -03:00
Pablo Hoffman
5aa8e63957 Removed 'sender' argument when sending signals, as we're not sending it consistently, and it's not being used by receivers either 2010-08-09 14:41:54 -03:00
Pablo Hoffman
3d121898e2 add signal handler name when logging errors 2010-08-09 13:32:44 -03:00
Pablo Hoffman
a08e62a25d silence irrelevant (and confusing) errors generated in tests by signals left active after engine tests run - we should really rewrite engine tests asap 2010-08-09 13:24:22 -03:00
Pablo Hoffman
bb2e0de7da fixed utils.signal tests broken in previous commit 2010-08-09 13:22:57 -03:00
Pablo Hoffman
8de0dc3647 Log full traceback of signal handler errors in send_catch_log() - closes #194. Also made engine use send_catch_log for spider_idle signal 2010-08-09 12:05:03 -03:00
Pablo Hoffman
2aa84073ab moved scrapy log observer logic into a separate function 2010-08-09 11:09:07 -03:00
Pablo Hoffman
5139843c5c avoid noisy KeyError in enqueue_scrape, when closing spiders manually 2010-08-09 11:07:45 -03:00
Pablo Hoffman
67e42d1bfd Moved scrapy/command/__init__.py to scrapy/command.py
--HG--
rename : scrapy/command/__init__.py => scrapy/command.py
2010-08-08 07:30:03 -03:00
Pablo Hoffman
c7d9f6e270 Added JSON item exporter with doc and unittests (closes #192), and also:
* put all json exporters in scrapy.contrib.exporters and deprecated
  scrapy.contrib.exporters.jsonlines to reduce module nesting
* use JSON exporter with EXPORT_FORMAT=json in file export pipeline
2010-08-07 15:52:59 -03:00
Pablo Hoffman
ba2369bfa1 changed variable names for clarity 2010-08-06 15:05:58 -03:00
Pablo Hoffman
35e6c8725b Added handles_request() class method to BaseSpider - closes #191 2010-08-06 14:59:18 -03:00
Pablo Hoffman
4d66f4c6f8 Added support for logging twisted errors generated outside of Scrapy - refs #188 2010-08-05 20:46:54 -03:00
Pablo Hoffman
c82260f5ec make runtests.sh more virtualenv-friendly 2010-08-05 13:31:19 -03:00
Pablo Hoffman
daa9506b34 Fixed bug with non-keepalive execution queues (closes #190) 2010-08-04 14:20:45 -03:00
Pablo Hoffman
49851d7f55 Automated merge with http://hg.scrapy.org/scrapy-0.9 2010-08-02 17:20:55 -03:00
Pablo Hoffman
6c68e4ce15 fixed documentation typo 2010-08-02 17:20:13 -03:00
Pablo Hoffman
b2878fbbef scrapy.utils.url_is_from_spider() - consider spider name as a possible domain for matching 2010-08-02 12:16:36 -03:00
Pablo Hoffman
453e7bf38c Scrapy logging refactoring (closes #188):
* added Twisted log observer for Scrapy, with unittests
* use numeric values from Python logging module for log levels
* removed scrapy.log.exc() function - use scrapy.log.err() instead
* removed logmessage_received signal - write a (twisted) log observer instead
* dropped support for obsolete `domain` argument
* dropped support for old setting names: LOGLEVEL, LOGFILE (replaced by LOG_LEVEL, LOG_FILE)
* deprecated `component` argument
2010-08-02 08:49:14 -03:00
Pablo Hoffman
c5fd113c09 minor fix to path name 2010-07-31 16:02:59 -03:00
Pablo Hoffman
f6eac8e348 Split single Debian package into two packages (closes #187):
- scrapy: which provides only the library and scrapy-ctl command
- scrapy-service: which provides the service, upstart script, system user, etc

This allows a clean install of just the library for those who are not
interested in the Scrapy service.

--HG--
rename : debian/scrapy.dirs => debian/scrapy-service.dirs
rename : debian/scrapy.install => debian/scrapy-service.install
rename : debian/scrapy.postinst => debian/scrapy-service.postinst
rename : debian/scrapy.postrm => debian/scrapy-service.postrm
rename : debian/scrapy.upstart => debian/scrapy-service.upstart
rename : debian/conf/service_conf.py => debian/service_conf.py
2010-07-31 15:50:12 -03:00
Ismael Carnales
e145ec686c Replaced default spider manager (TwistedPluginSpiderManager) with a simpler one that doesn't depend on Twisted Plugins infrastructure. 2010-07-30 17:30:32 -03:00
Pablo Hoffman
65c3de8e6a Rewrote walk_modules function to support eggs, and added tests 2010-07-30 17:05:55 -03:00
Ismael Carnales
6d06824488 utils: Add walk_packages utility function
--HG--
rename : scrapy/tests/test_utils_misc.py => scrapy/tests/test_utils_misc/__init__.py
2010-07-30 15:53:24 -03:00