Pablo Hoffman
ad3fd0afe8
fixed minor formatting issue with new feed exports doc
2010-08-17 14:37:59 -03:00
Pablo Hoffman
e741a807d2
Added new Feed exports extension with documentation and storage tests. Closes #197.
...
Also deprecated File export pipeline (to be removed in Scrapy 0.11).
Still need to add tests for FeedExport main extension code.
2010-08-17 14:27:48 -03:00
Pablo Hoffman
d17695ee4b
Added "scrapy" command with project settings auto-discovery. Refs #199
2010-08-17 00:59:24 -03:00
Pablo Hoffman
b563e56363
fixed bug in url_is_from_spider() when no allowed_domains class attribute is present
2010-08-16 12:25:51 -03:00
Pablo Hoffman
a28cf29008
Simplified BaseSpider code by removing backwards compatibility code
2010-08-16 10:10:57 -03:00
Pablo Hoffman
3e3a66620b
Added support for returning deferreds from (some) signal handlers. Closes #193
2010-08-14 21:10:37 -03:00
Pablo Hoffman
5755bdcddc
Improved logging of spider errors, which were previously logged as confusing "Unhandled errors" - closes #196
2010-08-13 01:45:47 -03:00
Pablo Hoffman
1df2c17b78
updated old documentation references
2010-08-12 20:45:11 -03:00
Pablo Hoffman
43d47e5d9b
Some improvements to Item Pipeline (closes #195):
...
* Made Item Pipeline Manager a subclass of scrapy.middleware.MiddlewareManager
* Added open_spider/close_spider methods with support for returning deferreds from them
* Inverted the process_item() arguments to be more friendly with deferred
callbacks (backwards compatibility kept through arguments introspection)
* Updated documentation with new methods and process_item() arguments change
2010-08-12 10:48:37 -03:00
Pablo Hoffman
2167bfb20d
added prepend_level argument to log._adapt_eventdict()
2010-08-12 10:26:50 -03:00
Pablo Hoffman
ae692c05b7
made MiddlewareManager an abstract class, and minor change to log message
2010-08-12 03:04:19 -03:00
Pablo Hoffman
192ec5e35b
Error logging improvements
2010-08-12 00:33:40 -03:00
Pablo Hoffman
6d21f2f23a
removed reference to old middleware
2010-08-10 18:27:58 -03:00
Pablo Hoffman
68ae1da0d0
removed unused import
2010-08-10 18:26:40 -03:00
Pablo Hoffman
6a0ef0e41c
removed unused import
2010-08-10 18:22:05 -03:00
Pablo Hoffman
0f342417b9
remove old unsupported item sampler middleware
2010-08-10 18:15:28 -03:00
Pablo Hoffman
2822db730f
added new MiddlewareManager class that will be used as base class for pipeline and middlewares
2010-08-10 18:03:47 -03:00
Pablo Hoffman
9d38a99aa8
updated missing doc reference from previous commit
2010-08-10 17:47:04 -03:00
Pablo Hoffman
dedd6b2256
added more information to deprecation notice
2010-08-10 17:42:05 -03:00
Pablo Hoffman
784722774b
moved scrapy.core.signals to scrapy.signals, keeping backwards compatibility
2010-08-10 17:40:53 -03:00
Pablo Hoffman
c359a34d7d
moved scrapy.core.exceptions to scrapy.exceptions, keeping backwards compatibility
...
--HG--
rename : scrapy/core/exceptions.py => scrapy/exceptions.py
2010-08-10 17:36:48 -03:00
Pablo Hoffman
355c615c2c
removed old unsupported SpiderProfiler extension
2010-08-10 17:23:22 -03:00
Pablo Hoffman
b1c0280616
removed scheduler middleware doc, as scheduler middleware will be removed soon
2010-08-10 16:59:49 -03:00
Pablo Hoffman
5aa8e63957
Removed 'sender' argument when sending signals, as we're not sending it consistently, and it's not being used by receivers either
2010-08-09 14:41:54 -03:00
Pablo Hoffman
3d121898e2
add signal handler name when logging errors
2010-08-09 13:32:44 -03:00
Pablo Hoffman
a08e62a25d
silence irrelevant (and confusing) errors generated in tests by signals left active after engine tests run - we should really rewrite engine tests asap
2010-08-09 13:24:22 -03:00
Pablo Hoffman
bb2e0de7da
fixed utils.signal tests broken in previous commit
2010-08-09 13:22:57 -03:00
Pablo Hoffman
8de0dc3647
Log full traceback of signal handler errors in send_catch_log() - closes #194. Also made engine use send_catch_log for spider_idle signal
2010-08-09 12:05:03 -03:00
Pablo Hoffman
2aa84073ab
moved scrapy log observer logic into a separate function
2010-08-09 11:09:07 -03:00
Pablo Hoffman
5139843c5c
avoid noisy KeyError in enqueue_scrape, when closing spiders manually
2010-08-09 11:07:45 -03:00
Pablo Hoffman
67e42d1bfd
Moved scrapy/command/__init__.py to scrapy/command.py
...
--HG--
rename : scrapy/command/__init__.py => scrapy/command.py
2010-08-08 07:30:03 -03:00
Pablo Hoffman
c7d9f6e270
Added JSON item exporter with doc and unittests (closes #192), and also:
...
* put all json exporters in scrapy.contrib.exporters and deprecated
scrapy.contrib.exporters.jsonlines to reduce module nesting
* use JSON exporter with EXPORT_FORMAT=json in file export pipeline
2010-08-07 15:52:59 -03:00
Pablo Hoffman
ba2369bfa1
changed variable names for clarity
2010-08-06 15:05:58 -03:00
Pablo Hoffman
35e6c8725b
Added handles_request() class method to BaseSpider - closes #191
2010-08-06 14:59:18 -03:00
Pablo Hoffman
4d66f4c6f8
Added support for logging twisted errors generated outside of Scrapy - refs #188
2010-08-05 20:46:54 -03:00
Pablo Hoffman
c82260f5ec
make runtests.sh more virtualenv-friendly
2010-08-05 13:31:19 -03:00
Pablo Hoffman
daa9506b34
Fixed bug with non-keepalive execution queues (closes #190)
2010-08-04 14:20:45 -03:00
Pablo Hoffman
49851d7f55
Automated merge with http://hg.scrapy.org/scrapy-0.9
2010-08-02 17:20:55 -03:00
Pablo Hoffman
6c68e4ce15
fixed documentation typo
2010-08-02 17:20:13 -03:00
Pablo Hoffman
b2878fbbef
scrapy.utils.url_is_from_spider() - Consider spider name as a possible domain for matching
2010-08-02 12:16:36 -03:00
Pablo Hoffman
453e7bf38c
Scrapy logging refactoring (closes #188):
...
* added Twisted log observer for Scrapy, with unittests
* use numeric values from Python logging module for log levels
* removed scrapy.log.exc() function - use scrapy.log.err() instead
* removed logmessage_received signal - write a (twisted) log observer instead
* dropped support for obsolete `domain` argument
* dropped support for old setting names: LOGLEVEL, LOGFILE (replaced by LOG_LEVEL, LOG_FILE)
* deprecated `component` argument
2010-08-02 08:49:14 -03:00
Pablo Hoffman
c5fd113c09
minor fix to path name
2010-07-31 16:02:59 -03:00
Pablo Hoffman
f6eac8e348
Split single Debian package into two packages (closes #187):
...
- scrapy: which provides only the library and scrapy-ctl command
- scrapy-service: which provides the service, upstart script, system user, etc
This allows a clean install of just the library for those who are not
interested in the Scrapy service.
--HG--
rename : debian/scrapy.dirs => debian/scrapy-service.dirs
rename : debian/scrapy.install => debian/scrapy-service.install
rename : debian/scrapy.postinst => debian/scrapy-service.postinst
rename : debian/scrapy.postrm => debian/scrapy-service.postrm
rename : debian/scrapy.upstart => debian/scrapy-service.upstart
rename : debian/conf/service_conf.py => debian/service_conf.py
2010-07-31 15:50:12 -03:00
Ismael Carnales
e145ec686c
Replaced default spider manager (TwistedPluginSpiderManager) with a simpler one that doesn't depend on Twisted Plugins infrastructure.
2010-07-30 17:30:32 -03:00
Pablo Hoffman
65c3de8e6a
Rewrote walk_modules function to support eggs, and added tests
2010-07-30 17:05:55 -03:00
Ismael Carnales
6d06824488
utils: Add walk_packages utility function
...
--HG--
rename : scrapy/tests/test_utils_misc.py => scrapy/tests/test_utils_misc/__init__.py
2010-07-30 15:53:24 -03:00
Ismael Carnales
9511a3154e
Fix error message when spider not found in parse command
2010-07-30 14:49:07 -03:00
Pablo Hoffman
e112def754
removed old untested module: scrapy.utils.mysql
2010-07-29 12:11:07 -03:00
Pablo Hoffman
6312826220
removed (no longer supported) webconsole code
2010-07-29 12:03:02 -03:00
molveyra
936ffe5e26
Automated merge with ssh://hg@hg.scrapy.org:2222/scrapy
2010-07-28 11:28:52 -03:00