mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 17:57:55 +00:00

409 Commits

Author SHA1 Message Date
Pablo Hoffman
6c921896a5 Expanded documentation on deploy command and versions. Refs #261 2010-10-19 00:11:45 -02:00
Pablo Hoffman
1d567cdce6 Added new 'deploy' command. Closes #261 2010-10-18 22:38:46 -02:00
Pablo Hoffman
7d8f922df9 Added documentation for CLOSESPIDER_ERRORCOUNT setting. Refs #254 2010-10-18 22:36:30 -02:00
Pablo Hoffman
c96f17c43d Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-18 03:21:21 -02:00
Pablo Hoffman
98662e53ea Formatting fix in Scrapyd doc 2010-10-17 03:20:23 -02:00
Pablo Hoffman
a3d85da96f Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-16 19:54:24 -02:00
Pablo Hoffman
5f65c26080 Some minor improvements to feature list in Scrapy at a Glance documentation page 2010-10-16 19:02:08 -02:00
Pablo Hoffman
d5c8caf07b Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-10 20:31:38 -02:00
Pablo Hoffman
b4fbc6c5fa Updated Scrapy Tutorial to reference feed exports, instead of a custom written pipeline, and extended item pipeline documentation to include a JSON writer. 2010-10-10 20:31:05 -02:00
Pablo Hoffman
aa4142e4ba Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-07 18:23:48 -02:00
Pablo Hoffman
f4accb6c7f Updated dmoz xpaths of Scrapy tutorial 2010-10-07 18:22:01 -02:00
Pablo Hoffman
7826869cb2 Added missing colon 2010-09-28 16:44:53 -03:00
Martin Santos
0bf9e4627c added support to CloseSpider extension for closing the spider after N pages have been crawled, using the CLOSESPIDER_PAGECOUNT setting. closes #253 2010-09-28 16:29:37 -03:00
Pablo Hoffman
279dcc245f Fixed role name in Sphinx doc 2010-09-26 01:01:06 -03:00
Pablo Hoffman
9599bde3e9 Removed RequestLimitMiddleware 2010-09-22 16:09:13 -03:00
Pablo Hoffman
ed4aec187f Ported code to use new unified access to spider settings, keeping backwards compatibility for old spider attributes. Refs #245 2010-09-22 16:09:13 -03:00
Pablo Hoffman
b6c2b55e5b Splitted settings classes from settings singleton. Closes #244
--HG--
rename : scrapy/conf/__init__.py => scrapy/conf.py
rename : scrapy/conf/default_settings.py => scrapy/settings/default_settings.py
rename : scrapy/tests/test_conf.py => scrapy/tests/test_settings.py
2010-09-22 15:47:33 -03:00
Pablo Hoffman
2ebfa7e68d Removed unneeded code (since autodoc is not used in Sphinx doc) 2010-09-22 10:52:02 -03:00
Shuaib
9288f622f9 Added formname parameter for FormRequest.from_response 2010-09-20 08:33:24 -03:00
Pablo Hoffman
bf467fc37a Check 'dont_merge_cookies' membership in request.meta, instead of getting its value 2010-09-10 15:29:15 -03:00
Pablo Hoffman
7d14a52234 Reference dont_merge_cookies in list of special Request.meta keys 2010-09-09 21:54:26 -03:00
Pablo Hoffman
7f21a6384f Documented handle_httpstatus_list request.meta key 2010-09-09 21:50:40 -03:00
Pablo Hoffman
f1c943543a Added dont_retry request.meta key to make RetryMiddleware ignore requests. Closes #234 2010-09-09 21:43:44 -03:00
Pablo Hoffman
9f01e3e79e Added dont_redirect request.meta key to make RedirectMiddleware ignore requests. Closes #233 2010-09-09 21:37:35 -03:00
Pablo Hoffman
7da79b90fe Make url/body attributes of Request/Response objects read-only - use replace() to change them. Deprecation warning left for backwards compatibility. 2010-09-08 00:15:11 -03:00
Pablo Hoffman
c1aab2f58e Copy callback/errback attributes when copying Requests 2010-09-08 00:15:09 -03:00
Pablo Hoffman
e9ebebb230 Removed UrlFilterMiddleware from scrapy.contrib - see this snippet for an alternative: http://snippets.scrapy.org/snippets/12/ 2010-09-07 17:51:02 -03:00
Daniel Grana
12b04b068f make download_timeout configurable by request. closes #229
--HG--
extra : rebase_source : e57dfd4aeb98d48b04fc4d0c6469e9a85e4b33a8
2010-09-07 13:01:40 -03:00
Pablo Hoffman
9158e9d682 Some changes to Scrapyd to support multiple configuration files, to make it easier to deploy Scrapyd applications. Also documented 'egg_runner' and 'application' options
--HG--
rename : debian/scrapyd.cfg => debian/000-default
rename : scrapyd/default_scrapyd.cfg => scrapyd/default_scrapyd.conf
2010-09-07 09:17:25 -03:00
Daniel Grana
3414bf13ee remove request_uploaded signal and move response_received and response_downloaded to downloader manager. closes #228
--HG--
extra : rebase_source : 4af0d2a01b34de8a21048bb7f4a66bfc484b3b8f
2010-09-06 23:23:14 -03:00
Pablo Hoffman
3c5ab10688 Added FAQ entry about __VIEWSTATE parameter 2010-09-06 13:17:08 -03:00
Pablo Hoffman
e3d67d74f7 docs/intro/overview.rst: add example of scraped data and introduce loaders 2010-09-06 10:04:00 -03:00
Pablo Hoffman
00d55fbbd1 Updated 'Scrapy at a glance' document replacing item pipeline example by a simpler usage of feed exports 2010-09-05 23:38:37 -03:00
Pablo Hoffman
766f2d910d Renamed Request Handlers to Download Handlers 2010-09-05 19:35:53 -03:00
Pablo Hoffman
a5cf71cb06 Updated Ubuntu package signing key location 2010-09-05 19:04:15 -03:00
Pablo Hoffman
6bf52fb50e Make telnet console and web service try a range of ports for binding, instead of just one. Closes #226 2010-09-05 06:48:08 -03:00
Pablo Hoffman
14e985b076 Updated Command line tool documentation 2010-09-05 05:29:58 -03:00
Pablo Hoffman
1190f97944 Updated settings documentation 2010-09-05 04:58:14 -03:00
Pablo Hoffman
ebdb733e95 Updated some old messages in Scrapy shell doc 2010-09-05 04:45:43 -03:00
Pablo Hoffman
2f12618890 Post reference to Scrapyd in FAQ 2010-09-05 04:35:27 -03:00
Pablo Hoffman
bf34094e5a Added versionadded:: notice to new documentation topics 2010-09-04 03:30:45 -03:00
Daniel Grana
9f4b1e47a4 damn, really fix httpcache docs 2010-09-04 03:26:41 -03:00
Daniel Grana
7ad901640b fix httpcache docs 2010-09-04 03:23:08 -03:00
Daniel Grana
1abaa79469 Make ignored schemes configurable in HttpCacheMiddleware. closes #224
--HG--
extra : rebase_source : 2e6e8b93c642290f9bd6eb634eb4c8cd6da07c75
2010-09-04 02:58:43 -03:00
Pablo Hoffman
7b9fa7fbaa Don't filter out requests coming from spiders that don't define allowed_domains. Closes #225 2010-09-04 02:23:04 -03:00
Pablo Hoffman
37e9c5d78e Added new Scrapy service with support for:
* multiple projects
* uploading scrapy projects as Python eggs
* scheduling spiders using a JSON API

Documentation is added along with the code.

Closes #218.

--HG--
rename : debian/scrapy-service.default => debian/scrapyd.default
rename : debian/scrapy-service.dirs => debian/scrapyd.dirs
rename : debian/scrapy-service.install => debian/scrapyd.install
rename : debian/scrapy-service.lintian-overrides => debian/scrapyd.lintian-overrides
rename : debian/scrapy-service.postinst => debian/scrapyd.postinst
rename : debian/scrapy-service.postrm => debian/scrapyd.postrm
rename : debian/scrapy-service.upstart => debian/scrapyd.upstart
rename : extras/scrapy.tac => extras/scrapyd.tac
2010-09-03 15:54:42 -03:00
Pablo Hoffman
758d21b2f9 Simplified images pipeline by allowing it to be used without having to override it in your project. Closes #217 2010-08-31 16:03:08 -03:00
Pablo Hoffman
e7b3247a18 Updated some missing references to scrapy-ws script 2010-08-27 01:05:59 -03:00
Pablo Hoffman
e2ed27e4fd Added documentation for Ubuntu packages. Refs #211 2010-08-23 21:28:32 -03:00
Pablo Hoffman
58b4cc2c32 Some minor fixes to Contributing documentation 2010-08-23 00:23:14 -03:00