mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 04:24:05 +00:00

2190 Commits

Author SHA1 Message Date
Pablo Hoffman
3012030b2f Fixed bug in file:// downloader handler with uris containing percent-escaped chars 2009-12-12 10:57:17 -02:00
Pablo Hoffman
cca4be4d64 uncommented some lines in extras/build_release.sh 2009-12-11 23:52:53 -02:00
Ismael Carnales
4ecc909bc1 Fix RobotsTxtMiddleware reference in doc 2009-12-04 15:37:24 -02:00
Pablo Hoffman
fe88aad27d removed some old code from scrapy.spider.models 2009-12-02 23:30:00 -02:00
Pablo Hoffman
41894c776e adjusted some core code and comments for consistency after the move to spider references 2009-12-02 23:06:17 -02:00
Pablo Hoffman
57bae5335f Added Settings class tests, and fixed minor bug 2009-12-02 15:52:17 -02:00
Pablo Hoffman
dbe0a6875c Decoupled Settings class from environment settings singleton (scrapy.conf.settings) 2009-12-02 15:11:11 -02:00
Pablo Hoffman
925eecce70 fixed bug in OpenSSL dependency test which was comparing versions alphabetically, instead of numerically 2009-12-02 13:59:58 -02:00
Pablo Hoffman
dda7e01d2f don't fail if OpenSSL is not installed in test_dependencies.py 2009-12-02 11:26:34 -02:00
Ismael Carnales
07344666e2 Move webconsole extensions doc to webconsole topic 2009-12-01 10:47:11 -02:00
Ismael Carnales
e694c8ed02 Remove domain references in close spider extension doc 2009-11-30 11:38:56 -02:00
Ismael Carnales
12a7ff7312 Rename Close domain to close spider in extensions doc 2009-11-30 11:36:18 -02:00
Ismael Carnales
8d9cedd88b Reorder signals doc to respect alphabetical order 2009-11-30 11:29:19 -02:00
Ismael Carnales
93cc3d2715 Correct param formatting in item pipelines doc 2009-11-30 11:04:15 -02:00
Pablo Hoffman
cabee59d4d Fixed compatibility issues with Twisted 9 (closes #128) 2009-11-28 22:15:09 -02:00
Pablo Hoffman
6084be3b2e added iter_all() function to scrapy.util.trackref module and improved memory leaks documentation. also added a new FAQ entry about memory issues 2009-11-28 16:21:59 -02:00
Pablo Hoffman
de35eee3cd Automated merge after applying ajones patch 2009-11-28 12:31:05 -02:00
Pablo Hoffman
207aae2b04 Fixed logging of "Spider closed" message in engine 2009-11-26 19:12:33 -02:00
Pablo Hoffman
34fcf6ba5b Added informative message when trying to use trackref and it's not enabled 2009-11-26 18:37:45 -02:00
Pablo Hoffman
fd50891113 Keep track of pending next-request calls in the engine and downloader, to cancel them properly when the spider is closed. This also avoids keeping spider references alive, after they are closed. 2009-11-26 17:00:16 -02:00
Pablo Hoffman
c0f1c8de04 use DelayedCall.active() instead of DelayedCall.called, to support cancelled tasks 2009-11-26 16:10:28 -02:00
Pablo Hoffman
88417a3ed1 Fixed bug in LiveStats webconsole module which was keeping references to spiders alive, after they were closed 2009-11-25 22:51:58 -02:00
Pablo Hoffman
a2854e3948 Added hack to speed up processing of IgnoreRequest errors (#125) 2009-11-25 22:28:59 -02:00
Pablo Hoffman
a48516b105 removed obsolete remove_escape_chars function - use replace_escape_chars instead 2009-11-25 22:09:47 -02:00
ajones1@gmail.com
0dcf47a706 fix name errors on robots.txt middleware during spider_close 2009-11-25 12:59:28 -08:00
Pablo Hoffman
a36909a3f9 Added BaseSpider objects to trackref 2009-11-25 16:59:47 -02:00
Pablo Hoffman
9311741236 Fixed memory leak in CachingResolver 2009-11-25 16:54:36 -02:00
Pablo Hoffman
1cfd959828 remove wrong support for returning Responses in scheduler middlewares 2009-11-25 11:20:43 -02:00
Pablo Hoffman
1a81ab6f07 renamed extension: DelayedCloseDomain to SpiderCloseDelay
--HG--
rename : scrapy/contrib/delayedclosedomain.py => scrapy/contrib/spiderclosedelay.py
2009-11-21 15:17:38 -02:00
Pablo Hoffman
326090d4a4 Reordered setting to preserve alphabetical order 2009-11-21 15:16:29 -02:00
Pablo Hoffman
a49aef2beb Renamed exception: DontCloseDomain to DontCloseSpider (closes #120) 2009-11-21 15:06:03 -02:00
Pablo Hoffman
0d75a3a636 Automated merge with http://hg.scrapy.org/scrapy-stable/ 2009-11-20 10:38:51 -02:00
Ismael Carnales
f86f62c5d9 Use setuptools for install, if not present fallback to distutils 2009-11-20 09:30:06 -02:00
Pablo Hoffman
dd662e09d8 some minor fixes to scheduler middleware doc 2009-11-19 12:23:54 -02:00
Pablo Hoffman
8c14abadbb Automated merge with http://hg.scrapy.org/scrapy-stable/ 2009-11-19 11:23:50 -02:00
Pablo Hoffman
9ff4dbf636 renamed file missing from previous commit
--HG--
rename : scrapy/xlib/patches.py => scrapy/xlib/twisted_250_monkeypatches.py
2009-11-19 11:23:36 -02:00
Pablo Hoffman
f4e93700bd Automated merge with http://hg.scrapy.org/scrapy-stable/ 2009-11-19 10:44:02 -02:00
Pablo Hoffman
bf55a4708f prevent 'import scrapy' from failing when twisted module is not available, also moved twisted 2.5.0 monkeypatch into a more specific module name 2009-11-19 10:41:36 -02:00
Pablo Hoffman
c4f77c4da0 minor fixes to images doc (thanks amccloud) 2009-11-16 11:15:25 -02:00
Pablo Hoffman
0d6aee1f12 updated wrong documentation 2009-11-13 20:03:56 -02:00
Daniel Grana
445d8cd9f0 delay next_request check after stopping a spider close to avoid 100% cpu usage loops in some cases 2009-11-06 22:02:12 -02:00
Pablo Hoffman
aeab5370cb StatsCollector: ported methods to receive spider instances (closes #113), removed list_domains() method, added iter_spider_stats() method 2009-11-14 20:28:59 -02:00
Pablo Hoffman
c4c6e7c8cd Automated merge with http://hg.scrapy.org/scrapy-stable/ 2009-11-13 20:04:39 -02:00
Pablo Hoffman
505dfe6f7e fixed exception when running scrapy shell on a non-textual response (fixes #116) 2009-11-13 17:21:59 -02:00
Pablo Hoffman
f3e861c89e replaced old reference to domain instead of spider 2009-11-13 14:44:03 -02:00
Pablo Hoffman
07655d05ea renamed REQUESTS_PER_SPIDER setting to CONCURRENT_REQUESTS_PER_SPIDER 2009-11-13 14:38:22 -02:00
Pablo Hoffman
564abd10ad Refactored HttpCache middleware:
* simplified code
* performance improvements
* removed awkward/unused domain sectorization
* it can now receive Settings on constructor
* added unittests
* added documentation about filesystem storage structure

Also made scrapy.conf.Settings objects instantiable with a dict which is used to override default settings.
2009-11-13 14:25:47 -02:00
Pablo Hoffman
db7fec1fef fixed doc typo 2009-11-12 12:17:39 -02:00
Pablo Hoffman
415dec4e16 made offsite middleware log messages when filtering out requests 2009-11-12 10:17:21 -02:00
Pablo Hoffman
ee08d38ab6 removed deprecated SCRAPYSETTINGS_MODULE environment variable 2009-11-06 16:56:17 -02:00