mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 21:37:42 +00:00

2404 Commits

Author SHA1 Message Date
Pablo Hoffman
192ec5e35b Error logging improvements 2010-08-12 00:33:40 -03:00
Pablo Hoffman
6d21f2f23a removed reference to old middleware 2010-08-10 18:27:58 -03:00
Pablo Hoffman
68ae1da0d0 removed unused import 2010-08-10 18:26:40 -03:00
Pablo Hoffman
6a0ef0e41c removed unused import 2010-08-10 18:22:05 -03:00
Pablo Hoffman
0f342417b9 remove old unsupported item sampler middleware 2010-08-10 18:15:28 -03:00
Pablo Hoffman
2822db730f added new MiddlewareManager class that will be used as base class for pipeline and middlewares 2010-08-10 18:03:47 -03:00
Pablo Hoffman
9d38a99aa8 updated missing doc reference from previous commit 2010-08-10 17:47:04 -03:00
Pablo Hoffman
dedd6b2256 added more information to deprecation notice 2010-08-10 17:42:05 -03:00
Pablo Hoffman
784722774b moved scrapy.core.signals to scrapy.signals, keeping backwards compatibility 2010-08-10 17:40:53 -03:00
Pablo Hoffman
c359a34d7d moved scrapy.core.exceptions to scrapy.exceptions, keeping backwards compatibility
--HG--
rename : scrapy/core/exceptions.py => scrapy/exceptions.py
2010-08-10 17:36:48 -03:00
Pablo Hoffman
355c615c2c removed old unsupported SpiderProfiler extension 2010-08-10 17:23:22 -03:00
Pablo Hoffman
b1c0280616 removed scheduler middleware doc, as scheduler middleware will be removed soon 2010-08-10 16:59:49 -03:00
Pablo Hoffman
5aa8e63957 Removed 'sender' argument when sending signals, as we're not sending it consistently, and it's not being used by receivers either 2010-08-09 14:41:54 -03:00
Pablo Hoffman
3d121898e2 add signal handler name when logging errors 2010-08-09 13:32:44 -03:00
Pablo Hoffman
a08e62a25d silence irrelevant (and confusing) errors generated in tests by signals left active after engine tests run - we should really rewrite engine tests asap 2010-08-09 13:24:22 -03:00
Pablo Hoffman
bb2e0de7da fixed utils.signal tests broken in previous commit 2010-08-09 13:22:57 -03:00
Pablo Hoffman
8de0dc3647 Log full traceback of signal handler errors in send_catch_log() - closes #194. Also made engine use send_catch_log for spider_idle signal 2010-08-09 12:05:03 -03:00
Pablo Hoffman
2aa84073ab moved scrapy log observer logic into a separate function 2010-08-09 11:09:07 -03:00
Pablo Hoffman
5139843c5c avoid noisy KeyError in enqueue_scrape, when closing spiders manually 2010-08-09 11:07:45 -03:00
Pablo Hoffman
67e42d1bfd Moved scrapy/command/__init__.py to scrapy/command.py
--HG--
rename : scrapy/command/__init__.py => scrapy/command.py
2010-08-08 07:30:03 -03:00
Pablo Hoffman
c7d9f6e270 Added JSON item exporter with doc and unittests (closes #192), and also:
* put all json exporters in scrapy.contrib.exporters and deprecated
  scrapy.contrib.exporters.jsonlines to reduce module nesting
* use JSON exporter with EXPORT_FORMAT=json in file export pipeline
2010-08-07 15:52:59 -03:00
Pablo Hoffman
ba2369bfa1 changed variable names for clarity 2010-08-06 15:05:58 -03:00
Pablo Hoffman
35e6c8725b Added handles_request() class method to BaseSpider - closes #191 2010-08-06 14:59:18 -03:00
Pablo Hoffman
4d66f4c6f8 Added support for logging twisted errors generated outside of Scrapy - refs #188 2010-08-05 20:46:54 -03:00
Pablo Hoffman
c82260f5ec make runtests.sh more virtualenv-friendly 2010-08-05 13:31:19 -03:00
Pablo Hoffman
daa9506b34 Fixed bug with non-keepalive execution queues (closes #190) 2010-08-04 14:20:45 -03:00
Pablo Hoffman
49851d7f55 Automated merge with http://hg.scrapy.org/scrapy-0.9 2010-08-02 17:20:55 -03:00
Pablo Hoffman
6c68e4ce15 fixed documentation typo 2010-08-02 17:20:13 -03:00
Pablo Hoffman
b2878fbbef scrapy.utils.url_is_from_spider() - Consider spider name as possible domain for matching in scrapy.utils.url_is_from_spider() 2010-08-02 12:16:36 -03:00
Pablo Hoffman
453e7bf38c Scrapy logging refactoring (closes #188):
* added Twisted log observer for Scrapy, with unittests
* use numeric values from Python logging module for log levels
* removed scrapy.log.exc() function - use scrapy.log.err() instead
* removed logmessage_received signal - write a (twisted) log observer instead
* dropped support for obsolete `domain` argument
* dropped support for old setting names: LOGLEVEL, LOGFILE (replaced by LOG_LEVEL, LOG_FILE)
* deprecated `component` argument
2010-08-02 08:49:14 -03:00
Pablo Hoffman
c5fd113c09 minor fix to path name 2010-07-31 16:02:59 -03:00
Pablo Hoffman
f6eac8e348 Split single Debian package into two packages (closes #187):
- scrapy: which provides only the library and scrapy-ctl command
- scrapy-service: which provides the service, upstart script, system user, etc

This allows a clean install of just the library for those who are not
interested in the Scrapy service.

--HG--
rename : debian/scrapy.dirs => debian/scrapy-service.dirs
rename : debian/scrapy.install => debian/scrapy-service.install
rename : debian/scrapy.postinst => debian/scrapy-service.postinst
rename : debian/scrapy.postrm => debian/scrapy-service.postrm
rename : debian/scrapy.upstart => debian/scrapy-service.upstart
rename : debian/conf/service_conf.py => debian/service_conf.py
2010-07-31 15:50:12 -03:00
Ismael Carnales
e145ec686c Replaced default spider manager (TwistedPluginSpiderManger) with a simpler one that doesn't depend on Twisted Plugins infrastructure. 2010-07-30 17:30:32 -03:00
Pablo Hoffman
65c3de8e6a Rewritten walk_modules function to support eggs, and added tests 2010-07-30 17:05:55 -03:00
Ismael Carnales
6d06824488 utils: Add walk_packages utility function
--HG--
rename : scrapy/tests/test_utils_misc.py => scrapy/tests/test_utils_misc/__init__.py
2010-07-30 15:53:24 -03:00
Ismael Carnales
9511a3154e Fix error message when spider not found in parse command 2010-07-30 14:49:07 -03:00
Pablo Hoffman
e112def754 removed old untested module: scrapy.utils.mysql 2010-07-29 12:11:07 -03:00
Pablo Hoffman
6312826220 removed (no longer supported) webconsole code 2010-07-29 12:03:02 -03:00
molveyra
936ffe5e26 Automated merge with ssh://hg@hg.scrapy.org:2222/scrapy 2010-07-28 11:28:52 -03:00
molveyra
8781ef3914 Remove restriction of marking ignore-beneath only for img unpaired tags 2010-07-28 11:16:42 -03:00
Pablo Hoffman
2349d241e0 removed custom Makefile and version based on mercurial revision 2010-07-24 17:02:08 -03:00
Pablo Hoffman
e2290a5359 Some changes to Crawl spider:
* added process_request attribute to rules
* removed docstrings, since it duplicates documentation
2010-07-22 18:40:35 -03:00
Daniel Grana
4e2859e5d5 Automated merge with ssh://hg.scrapy.org/scrapy-0.9 2010-07-20 15:47:46 -03:00
Daniel Grana
68c7ef7d98 fix scraper leak closing spider. closes #182 2010-07-20 15:47:07 -03:00
Daniel Grana
3e013f564b update docs for defaultheaders middleware and change spider attribute to match global setting name 2010-07-16 16:17:08 -03:00
Daniel Grana
6883a99c1e Automated merge with ssh://hg.scrapy.org/scrapy-0.9 2010-07-16 14:56:00 -03:00
Daniel Grana
b799e5ee37 Support default headers per spider. closes #181
--HG--
extra : rebase_source : 60162dffa4fbab525501e46b479dc272b8998942
2010-07-16 14:51:14 -03:00
Pablo Hoffman
b91d40ba78 Fixed grammar error in doc (patch by stav) - closes #176 2010-07-16 11:34:18 -03:00
Pablo Hoffman
b8aa74ee9e bugfix in request_httprepr() function 2010-07-15 12:04:55 -03:00
Martin Olveyra
ec850b9fd1 Fix memusage report concatenation 2010-07-14 18:47:09 -03:00