mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 21:43:53 +00:00

831 Commits

Author SHA1 Message Date
Pablo Hoffman
c380910b40 Merge pull request #167 from alexcepoi/sep-017
Spider contracts (SEP-017)
2012-09-28 13:57:07 -07:00
Alex Cepoi
11d29c7005 SEP-017 contracts: add tests and minor improvements 2012-09-21 00:12:46 +02:00
Pablo Hoffman
b46b5a6ef0 Documented AutoThrottle extension and added to extensions available by
default. Also deprecated concurrency and delay settings, in favour of
using the standard Scrapy ones.
2012-09-20 18:52:57 -03:00
Pablo Hoffman
c7f8219901 - removed scrapy.conf singleton from scrapy.log, scrapy.responsetypes,
scrapy.http.response.text, scrapy.selector
- fixed bug with scrapy.conf.settings backwards compatibility support
- added facility to notify (and provide some guidelines) about deprecated/obsolete settings
2012-09-19 03:03:34 -03:00
stav
303e13f616 selector documentation typos 2012-09-18 12:56:52 -05:00
Pablo Hoffman
3d736e657f fixed typo in doc 2012-09-18 10:51:01 -03:00
Pablo Hoffman
eed6eb49da make DBM the new default storage backend for HTTP cache middleware, simplified DBM storage backend code to avoid dealing with many spiders at once (not needed), and update httpcache stats names (hit -> hits, miss -> misses) 2012-09-17 10:11:07 -03:00
Pablo Hoffman
8f2dda12cc removed another instance of scrapy.conf.settings singleton, this time from scrapy.utils.trackref. From now on, trackrefs functionality will always be enabled, as it imposes a very minimal performance overhead 2012-09-16 21:21:44 -03:00
Pablo Hoffman
81ed2d2d0b major Stats Collection refactoring: removed separation of global/per-spider stats, removed stats-related signals (stats_spider_opened, etc). Stats are much simpler now, backwards compatibility is kept on the Stats Collector API. 2012-09-14 12:31:33 -03:00
Pablo Hoffman
a874964ad4 renamed 'XPath Selectors' title to just 'Selectors' 2012-09-13 15:24:44 -03:00
Pablo Hoffman
acb8895e1a changed note in scrapyd doc to use sphinx notes 2012-09-13 15:22:59 -03:00
Pablo Hoffman
26f1d5cb48 Merge pull request #171 from artem-dev/scrapyd_job_times
added start and stop times for scrapyd list jobs web service
2012-09-13 11:16:01 -07:00
Artem Bogomyagkov
4d8f253912 committed doc file missed from prev commit 2012-09-12 11:12:59 +03:00
Pablo Hoffman
7ef593c5c2 refactored MailSender to get rid of scrapy.conf singleton, also removed ill-designed scrapy.mail.mail_sent signal 2012-09-11 16:27:19 -03:00
Alex Cepoi
bf8dc61fb7 SEP-017 contracts: pretty-printing and docs 2012-09-10 23:17:27 +02:00
Pablo Hoffman
fff2871828 added doc section (and FAQ) about spider arguments 2012-09-04 14:49:30 -03:00
Pablo Hoffman
e2f9daac67 fixed formatting in scrapyd release notes 2012-09-03 16:58:58 -03:00
Pablo Hoffman
be206ca5ab added process_start_requests method to spider middlewares 2012-08-31 16:41:50 -03:00
Pablo Hoffman
b920674666 corrected minor issue with doc references 2012-08-31 16:41:49 -03:00
Pablo Hoffman
4ec99117d3 fixed minor doc typo 2012-08-30 11:56:30 -03:00
Pablo Hoffman
70f8e517a1 promoted DjangoItem to main contrib 2012-08-29 11:23:11 -03:00
Pablo Hoffman
babfc6e79b Updated documentation after singleton removal changes.
Also removed some unused code and made some minor additional
refactoring.
2012-08-28 18:35:57 -03:00
Pablo Hoffman
36f47a4aec Removed per-spider settings concept, and scrapy.conf.settings singleton from many extensions and middlewares. There are some still remaining, that will be removed in future commits 2012-08-21 17:27:45 -03:00
Daniel Graña
b7b0c49520 append parse command to example code sections in docs. closes #162 2012-08-06 09:10:16 -03:00
Pablo Hoffman
832e45073b fixed typo in stats documentation. closes #159 2012-07-20 17:13:06 -03:00
Daniel Graña
277ed0ae23 Merge pull request #145 from alexcepoi/cookies-changes
domain and path support for request cookies
2012-06-25 11:29:04 -07:00
Alexandru Cepoi
177c81745d domain and path support for request cookies 2012-06-25 20:17:59 +02:00
Pablo Hoffman
179e3810dc fixed links to doc. closes #150 2012-06-24 01:00:33 -03:00
Alexandru Cepoi
f4faa19e31 added docs topic debugging spiders 2012-06-21 20:03:33 +02:00
Alexandru Cepoi
3e05a2ecf6 update docs for parse command 2012-06-12 18:28:10 +02:00
Pablo Hoffman
9686f97242 added precise to supported ubuntu distros 2012-05-12 19:54:36 -03:00
Pablo Hoffman
58e88ed246 scrapyd: do not set SCRAPY_FEED_URI/SCRAPY_LOG_FILE if items_dir/logs_dir settings are not set 2012-05-08 17:43:00 -03:00
Pablo Hoffman
9c3b9f2968 fixed bug in json-rpc webservice reported in https://groups.google.com/d/topic/scrapy-users/qgVBmFybNAQ/discussion. also removed no longer supported 'run' command from extras/scrapy-ws.py 2012-05-03 12:05:40 -03:00
Pablo Hoffman
abcac4fcbd updated maintainer to scrapinghub 2012-05-02 03:25:35 -03:00
stav
86dba76d1f documentation indentation 2012-04-30 13:09:34 -05:00
Pablo Hoffman
d567d8efbe added note to docs/topics/firebug.rst about google directory being shut down 2012-04-19 01:34:20 -03:00
stav
f1802289cd small doc typo change to get the fork rolling 2012-04-11 12:05:39 -05:00
Pablo Hoffman
27018fced7 changed default user agent to Scrapy/0.15 (+http://scrapy.org) and removed no longer needed BOT_VERSION setting 2012-03-23 13:45:21 -03:00
Pablo Hoffman
8933e2f2be added REFERER_ENABLED setting, to control referer middleware 2012-03-22 16:35:14 -03:00
Jason Yeo
da826aa13d fixed minor mistake in Request objects documentation 2012-03-21 10:25:41 +08:00
Pablo Hoffman
175c70ad44 fixed minor defect in link extractors documentation 2012-03-20 22:56:45 -03:00
Pablo Hoffman
35fb01156e removed some obsolete remaining code related to sqlite support in scrapy 2012-03-16 11:55:55 -03:00
Pablo Hoffman
2b16ebdc11 added minor clarification on cookiejar request meta key usage 2012-02-29 07:19:01 -02:00
lostsnow
5afe4f50c1 scrapyd: support bind to a specific ip address 2012-02-29 13:47:40 +08:00
Pablo Hoffman
81abb45000 fixed bug in new cookiejar documentation 2012-02-28 11:08:25 -02:00
Pablo Hoffman
26c8004125 added documentation for the new cookiejar Request.meta key 2012-02-27 19:58:58 -02:00
Pablo Hoffman
7fe7c3f3b1 MemoryUsage extension: close the spiders (instead of stopping the engine) when the limit is exceeded, providing a descriptive reason for the close. Also fixed default value of MEMUSAGE_ENABLED setting to match the documentation. 2012-02-23 17:05:06 -02:00
Pablo Hoffman
7b8942a648 updated StackTraceDump extension doc 2012-02-16 15:14:17 -02:00
Pablo Hoffman
0b0bce7f3c scrapyd: added cancel.json and listjobs.json api methods to documentation 2012-01-05 11:23:25 -02:00
Pablo Hoffman
8f42633a94 scrapyd: added clarification about how to disable items feeds generation 2012-01-05 11:20:50 -02:00