Pablo Hoffman
c359a34d7d
moved scrapy.core.exceptions to scrapy.exceptions, keeping backwards compatibility
--HG--
rename : scrapy/core/exceptions.py => scrapy/exceptions.py
2010-08-10 17:36:48 -03:00
Pablo Hoffman
355c615c2c
removed old unsupported SpiderProfiler extension
2010-08-10 17:23:22 -03:00
Pablo Hoffman
b1c0280616
removed scheduler middleware doc, as scheduler middleware will be removed soon
2010-08-10 16:59:49 -03:00
Pablo Hoffman
5aa8e63957
Removed 'sender' argument when sending signals, as we're not sending it consistently, and it's not being used by receivers either
2010-08-09 14:41:54 -03:00
Pablo Hoffman
3d121898e2
add signal handler name when logging errors
2010-08-09 13:32:44 -03:00
Pablo Hoffman
a08e62a25d
silence irrelevant (and confusing) errors generated in tests by signals left active after engine tests run - we should really rewrite engine tests asap
2010-08-09 13:24:22 -03:00
Pablo Hoffman
bb2e0de7da
fixed utils.signal tests broken in previous commit
2010-08-09 13:22:57 -03:00
Pablo Hoffman
8de0dc3647
Log full traceback of signal handler errors in send_catch_log() - closes #194. Also made engine use send_catch_log for spider_idle signal
2010-08-09 12:05:03 -03:00
Pablo Hoffman
2aa84073ab
moved scrapy log observer logic into a separate function
2010-08-09 11:09:07 -03:00
Pablo Hoffman
5139843c5c
avoid noisy KeyError in enqueue_scrape, when closing spiders manually
2010-08-09 11:07:45 -03:00
Pablo Hoffman
67e42d1bfd
Moved scrapy/command/__init__.py to scrapy/command.py
--HG--
rename : scrapy/command/__init__.py => scrapy/command.py
2010-08-08 07:30:03 -03:00
Pablo Hoffman
c7d9f6e270
Added JSON item exporter with doc and unittests (closes #192), and also:
* put all json exporters in scrapy.contrib.exporters and deprecated
scrapy.contrib.exporters.jsonlines to reduce module nesting
* use JSON exporter with EXPORT_FORMAT=json in file export pipeline
2010-08-07 15:52:59 -03:00
Pablo Hoffman
ba2369bfa1
changed variable names for clarity
2010-08-06 15:05:58 -03:00
Pablo Hoffman
35e6c8725b
Added handles_request() class method to BaseSpider - closes #191
2010-08-06 14:59:18 -03:00
Pablo Hoffman
4d66f4c6f8
Added support for logging twisted errors generated outside of Scrapy - refs #188
2010-08-05 20:46:54 -03:00
Pablo Hoffman
c82260f5ec
make runtests.sh more virtualenv-friendly
2010-08-05 13:31:19 -03:00
Pablo Hoffman
daa9506b34
Fixed bug with non-keepalive execution queues (closes #190)
2010-08-04 14:20:45 -03:00
Pablo Hoffman
49851d7f55
Automated merge with http://hg.scrapy.org/scrapy-0.9
2010-08-02 17:20:55 -03:00
Pablo Hoffman
6c68e4ce15
fixed documentation typo
2010-08-02 17:20:13 -03:00
Pablo Hoffman
b2878fbbef
Consider spider name as a possible domain for matching in scrapy.utils.url_is_from_spider()
2010-08-02 12:16:36 -03:00
Pablo Hoffman
453e7bf38c
Scrapy logging refactoring (closes #188):
* added Twisted log observer for Scrapy, with unittests
* use numeric values from Python logging module for log levels
* removed scrapy.log.exc() function - use scrapy.log.err() instead
* removed logmessage_received signal - write a (twisted) log observer instead
* dropped support for obsolete `domain` argument
* dropped support for old setting names: LOGLEVEL, LOGFILE (replaced by LOG_LEVEL, LOG_FILE)
* deprecated `component` argument
2010-08-02 08:49:14 -03:00
Pablo Hoffman
c5fd113c09
minor fix to path name
2010-07-31 16:02:59 -03:00
Pablo Hoffman
f6eac8e348
Split single Debian package into two packages (closes #187):
- scrapy: which provides only the library and scrapy-ctl command
- scrapy-service: which provides the service, upstart script, system user, etc
This allows a clean install of just the library for those who are not
interested in the Scrapy service.
--HG--
rename : debian/scrapy.dirs => debian/scrapy-service.dirs
rename : debian/scrapy.install => debian/scrapy-service.install
rename : debian/scrapy.postinst => debian/scrapy-service.postinst
rename : debian/scrapy.postrm => debian/scrapy-service.postrm
rename : debian/scrapy.upstart => debian/scrapy-service.upstart
rename : debian/conf/service_conf.py => debian/service_conf.py
2010-07-31 15:50:12 -03:00
Ismael Carnales
e145ec686c
Replaced default spider manager (TwistedPluginSpiderManger) with a simpler one that doesn't depend on Twisted Plugins infrastructure.
2010-07-30 17:30:32 -03:00
Pablo Hoffman
65c3de8e6a
Rewrote walk_modules function to support eggs, and added tests
2010-07-30 17:05:55 -03:00
Ismael Carnales
6d06824488
utils: Add walk_packages utility function
--HG--
rename : scrapy/tests/test_utils_misc.py => scrapy/tests/test_utils_misc/__init__.py
2010-07-30 15:53:24 -03:00
Ismael Carnales
9511a3154e
Fix error message when spider not found in parse command
2010-07-30 14:49:07 -03:00
Pablo Hoffman
e112def754
removed old untested module: scrapy.utils.mysql
2010-07-29 12:11:07 -03:00
Pablo Hoffman
6312826220
removed (no longer supported) webconsole code
2010-07-29 12:03:02 -03:00
molveyra
936ffe5e26
Automated merge with ssh://hg@hg.scrapy.org:2222/scrapy
2010-07-28 11:28:52 -03:00
molveyra
8781ef3914
Remove restriction of marking ignore-beneath only for img unpaired tags
2010-07-28 11:16:42 -03:00
Pablo Hoffman
2349d241e0
removed custom Makefile and version based on mercurial revision
2010-07-24 17:02:08 -03:00
Pablo Hoffman
e2290a5359
Some changes to Crawl spider:
* added process_request attribute to rules
* removed docstrings, since they duplicate the documentation
2010-07-22 18:40:35 -03:00
Daniel Grana
4e2859e5d5
Automated merge with ssh://hg.scrapy.org/scrapy-0.9
2010-07-20 15:47:46 -03:00
Daniel Grana
68c7ef7d98
fix scraper leak when closing spider. closes #182
2010-07-20 15:47:07 -03:00
Daniel Grana
3e013f564b
update docs for defaultheaders middleware and change spider attribute to match global setting name
2010-07-16 16:17:08 -03:00
Daniel Grana
6883a99c1e
Automated merge with ssh://hg.scrapy.org/scrapy-0.9
2010-07-16 14:56:00 -03:00
Daniel Grana
b799e5ee37
Support default headers per spider. closes #181
--HG--
extra : rebase_source : 60162dffa4fbab525501e46b479dc272b8998942
2010-07-16 14:51:14 -03:00
Pablo Hoffman
b91d40ba78
Fixed grammar error in doc (patch by stav) - closes #176
2010-07-16 11:34:18 -03:00
Pablo Hoffman
b8aa74ee9e
bugfix in request_httprepr() function
2010-07-15 12:04:55 -03:00
Martin Olveyra
ec850b9fd1
Fix memusage report concatenation
2010-07-14 18:47:09 -03:00
Pablo Hoffman
90a04f0530
Automated merge with http://hg.scrapy.org/scrapy-0.9
2010-07-13 19:47:55 -03:00
Pablo Hoffman
cc32f6ec66
Applied patch to ClientForm to fix bug with wrong entities. Also added tests and left patch in repo in case we upgrade ClientForm in the future and need to re-apply it
2010-07-13 19:46:53 -03:00
Pablo Hoffman
9e37ec4230
fixed documentation typo (closes #151)
2010-07-13 19:03:02 -03:00
Ping Yin
b3a65d3313
HTTPCACHE: Don't cache responses with status codes listed in HTTPCACHE_IGNORE_HTTP_CODES
2010-07-09 13:14:25 -03:00
Pablo Hoffman
2067bcd8d0
Automated merge with http://hg.scrapy.org/scrapy-0.9
2010-07-08 14:03:58 -03:00
Juan Picca
2ddbbc8152
allow passing custom headers in FormRequest.from_response()
2010-07-08 14:02:28 -03:00
Pablo Hoffman
a6a86d9b4a
Automated merge with http://hg.scrapy.org/scrapy-0.9
2010-07-01 11:48:36 -03:00
Martin Olveyra
b258fc3305
Fixed bug with float values in meta refresh
2010-07-01 11:46:06 -03:00
Pablo Hoffman
3e976e5005
Automated merge with http://hg.scrapy.org/scrapy-0.9
2010-06-28 00:55:35 -03:00