Pablo Hoffman
192ec5e35b
Error logging improvements
2010-08-12 00:33:40 -03:00
Pablo Hoffman
6d21f2f23a
removed reference to old middleware
2010-08-10 18:27:58 -03:00
Pablo Hoffman
68ae1da0d0
removed unused import
2010-08-10 18:26:40 -03:00
Pablo Hoffman
6a0ef0e41c
removed unused import
2010-08-10 18:22:05 -03:00
Pablo Hoffman
0f342417b9
remove old unsupported item sampler middleware
2010-08-10 18:15:28 -03:00
Pablo Hoffman
2822db730f
added new MiddlewareManager class that will be used as base class for pipeline and middlewares
2010-08-10 18:03:47 -03:00
Pablo Hoffman
9d38a99aa8
updated missing doc reference from previous commit
2010-08-10 17:47:04 -03:00
Pablo Hoffman
dedd6b2256
added more information to deprecation notice
2010-08-10 17:42:05 -03:00
Pablo Hoffman
784722774b
moved scrapy.core.signals to scrapy.signals, keeping backwards compatibility
2010-08-10 17:40:53 -03:00
Pablo Hoffman
c359a34d7d
moved scrapy.core.exceptions to scrapy.exceptions, keeping backwards compatibility
...
--HG--
rename : scrapy/core/exceptions.py => scrapy/exceptions.py
2010-08-10 17:36:48 -03:00
Pablo Hoffman
355c615c2c
removed old unsupported SpiderProfiler extension
2010-08-10 17:23:22 -03:00
Pablo Hoffman
b1c0280616
removed scheduler middleware doc, as scheduler middleware will be removed soon
2010-08-10 16:59:49 -03:00
Pablo Hoffman
5aa8e63957
Removed 'sender' argument when sending signals, as we're not sending it consistently, and it's not being used by receivers either
2010-08-09 14:41:54 -03:00
Pablo Hoffman
3d121898e2
add signal handler name when logging errors
2010-08-09 13:32:44 -03:00
Pablo Hoffman
a08e62a25d
silence irrelevant (and confusing) errors generated in tests by signals left active after engine tests run - we should really rewrite engine tests asap
2010-08-09 13:24:22 -03:00
Pablo Hoffman
bb2e0de7da
fixed utils.signal tests broken in previous commit
2010-08-09 13:22:57 -03:00
Pablo Hoffman
8de0dc3647
Log full traceback of signal handler errors in send_catch_log() - closes #194. Also made engine use send_catch_log for spider_idle signal
2010-08-09 12:05:03 -03:00
Pablo Hoffman
2aa84073ab
moved scrapy log observer logic into a separate function
2010-08-09 11:09:07 -03:00
Pablo Hoffman
5139843c5c
avoid noisy KeyError in enqueue_scrape, when closing spiders manually
2010-08-09 11:07:45 -03:00
Pablo Hoffman
67e42d1bfd
Moved scrapy/command/__init__.py to scrapy/command.py
...
--HG--
rename : scrapy/command/__init__.py => scrapy/command.py
2010-08-08 07:30:03 -03:00
Pablo Hoffman
c7d9f6e270
Added JSON item exporter with doc and unittests (closes #192), and also:
...
* put all json exporters in scrapy.contrib.exporters and deprecated
scrapy.contrib.exporters.jsonlines to reduce module nesting
* use JSON exporter with EXPORT_FORMAT=json in file export pipeline
2010-08-07 15:52:59 -03:00
Pablo Hoffman
ba2369bfa1
changed variable names for clarity
2010-08-06 15:05:58 -03:00
Pablo Hoffman
35e6c8725b
Added handles_request() class method to BaseSpider - closes #191
2010-08-06 14:59:18 -03:00
Pablo Hoffman
4d66f4c6f8
Added support for logging twisted errors generated outside of Scrapy - refs #188
2010-08-05 20:46:54 -03:00
Pablo Hoffman
c82260f5ec
make runtests.sh more virtualenv-friendly
2010-08-05 13:31:19 -03:00
Pablo Hoffman
daa9506b34
Fixed bug with non-keepalive execution queues (closes #190)
2010-08-04 14:20:45 -03:00
Pablo Hoffman
49851d7f55
Automated merge with http://hg.scrapy.org/scrapy-0.9
2010-08-02 17:20:55 -03:00
Pablo Hoffman
6c68e4ce15
fixed documentation typo
2010-08-02 17:20:13 -03:00
Pablo Hoffman
b2878fbbef
scrapy.utils.url_is_from_spider() - Consider spider name as a possible domain for matching
2010-08-02 12:16:36 -03:00
Pablo Hoffman
453e7bf38c
Scrapy logging refactoring (closes #188):
...
* added Twisted log observer for Scrapy, with unittests
* use numeric values from Python logging module for log levels
* removed scrapy.log.exc() function - use scrapy.log.err() instead
* removed logmessage_received signal - write a (twisted) log observer instead
* dropped support for obsolete `domain` argument
* dropped support for old setting names: LOGLEVEL, LOGFILE (replaced by LOG_LEVEL, LOG_FILE)
* deprecated `component` argument
2010-08-02 08:49:14 -03:00
Pablo Hoffman
c5fd113c09
minor fix to path name
2010-07-31 16:02:59 -03:00
Pablo Hoffman
f6eac8e348
Split single Debian package into two packages (closes #187):
...
- scrapy: which provides only the library and scrapy-ctl command
- scrapy-service: which provides the service, upstart script, system user, etc
This allows a clean install of just the library for those who are not
interested in the Scrapy service.
--HG--
rename : debian/scrapy.dirs => debian/scrapy-service.dirs
rename : debian/scrapy.install => debian/scrapy-service.install
rename : debian/scrapy.postinst => debian/scrapy-service.postinst
rename : debian/scrapy.postrm => debian/scrapy-service.postrm
rename : debian/scrapy.upstart => debian/scrapy-service.upstart
rename : debian/conf/service_conf.py => debian/service_conf.py
2010-07-31 15:50:12 -03:00
Ismael Carnales
e145ec686c
Replaced default spider manager (TwistedPluginSpiderManager) with a simpler one that doesn't depend on Twisted Plugins infrastructure.
2010-07-30 17:30:32 -03:00
Pablo Hoffman
65c3de8e6a
Rewrote walk_modules function to support eggs, and added tests
2010-07-30 17:05:55 -03:00
Ismael Carnales
6d06824488
utils: Add walk_packages utility function
...
--HG--
rename : scrapy/tests/test_utils_misc.py => scrapy/tests/test_utils_misc/__init__.py
2010-07-30 15:53:24 -03:00
Ismael Carnales
9511a3154e
Fix error message when spider not found in parse command
2010-07-30 14:49:07 -03:00
Pablo Hoffman
e112def754
removed old untested module: scrapy.utils.mysql
2010-07-29 12:11:07 -03:00
Pablo Hoffman
6312826220
removed (no longer supported) webconsole code
2010-07-29 12:03:02 -03:00
molveyra
936ffe5e26
Automated merge with ssh://hg@hg.scrapy.org:2222/scrapy
2010-07-28 11:28:52 -03:00
molveyra
8781ef3914
Remove restriction of marking ignore-beneath only for img unpaired tags
2010-07-28 11:16:42 -03:00
Pablo Hoffman
2349d241e0
removed custom Makefile and version based on mercurial revision
2010-07-24 17:02:08 -03:00
Pablo Hoffman
e2290a5359
Some changes to Crawl spider:
...
* added process_request attribute to rules
* removed docstrings, since they duplicate the documentation
2010-07-22 18:40:35 -03:00
Daniel Grana
4e2859e5d5
Automated merge with ssh://hg.scrapy.org/scrapy-0.9
2010-07-20 15:47:46 -03:00
Daniel Grana
68c7ef7d98
fix scraper leak when closing spider. closes #182
2010-07-20 15:47:07 -03:00
Daniel Grana
3e013f564b
update docs for defaultheaders middleware and change spider attribute to match global setting name
2010-07-16 16:17:08 -03:00
Daniel Grana
6883a99c1e
Automated merge with ssh://hg.scrapy.org/scrapy-0.9
2010-07-16 14:56:00 -03:00
Daniel Grana
b799e5ee37
Support default headers per spider. closes #181
...
--HG--
extra : rebase_source : 60162dffa4fbab525501e46b479dc272b8998942
2010-07-16 14:51:14 -03:00
Pablo Hoffman
b91d40ba78
Fixed grammar error in doc (patch by stav) - closes #176
2010-07-16 11:34:18 -03:00
Pablo Hoffman
b8aa74ee9e
bugfix in request_httprepr() function
2010-07-15 12:04:55 -03:00
Martin Olveyra
ec850b9fd1
Fix memusage report concatenation
2010-07-14 18:47:09 -03:00