mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 19:24:12 +00:00

418 Commits

Author SHA1 Message Date
Pablo Hoffman
7400ceb1ed added 502 to RETRY_HTTP_CODES 2013-02-22 19:12:59 -02:00
Pablo Hoffman
a038f46859 doc: fixed rst title 2013-02-14 11:11:17 -02:00
Pablo Hoffman
22edc44c6c doc: remove links to diveintopython.org, which is no longer available. closes #246 2013-02-14 11:09:40 -02:00
Daniel Graña
5db45b3825 remove scrapyd, it was migrated to its own repository 2013-02-06 05:24:07 +00:00
whodatninja
8e3b5baac5 Fix typo labeling attrs type bool instead of list 2013-02-05 15:10:41 -05:00
Chris Tilden
aae6aed4fb fixes spelling errors in documentation 2013-01-22 14:52:18 -08:00
Pablo Hoffman
6ab8afb992 improve documentation about removing namespaces 2013-01-18 12:35:30 -02:00
Pablo Hoffman
1ba04b1fc3 added remove_namespaces() method to XmlXPathSelector objects 2013-01-18 12:20:03 -02:00
Pablo Hoffman
c31441a273 revert default HTTP cache policy to dummy (instead of RFC2616) 2013-01-17 13:08:29 -02:00
Daniel Graña
897195186a document new FormRequest parameter named formxpath that matches forms using xpath 2013-01-08 18:36:20 -02:00
Daniel Graña
75563b3f00 Add list of supported and missing RFC2616 caching features 2013-01-08 18:16:44 -02:00
Daniel Graña
d8a760bf57 Merge branch 'http-cache-middleware'
Conflicts:
	scrapy/contrib/downloadermiddleware/httpcache.py
	scrapy/contrib/httpcache.py
	scrapy/tests/test_downloadermiddleware_httpcache.py
2013-01-08 17:34:48 -02:00
Daniel Graña
864a7aef87 More httpcache updates
* Change default cache policy to RFC2616
* Update HttpCacheMiddleware documentation
* Move policies to scrapy.contrib.httpcache
* remove a lint error for .has_key() usage in DBM storage backend
2013-01-08 17:26:32 -02:00
Daniel Graña
defc4f89b5 update metarefresh settings 2013-01-08 11:41:19 -02:00
Daniel Graña
6a2b23883a Add MetaRefreshMiddleware docs 2013-01-08 11:25:38 -02:00
Daniel Graña
076ba40404 update DOWNLOADER_MIDDLEWARES_BASE setting documentation 2013-01-08 10:50:27 -02:00
Pablo Hoffman
227a1d666b add doc about disabling an extension. refs #132 2013-01-07 13:16:19 -02:00
Pedro Faustino
5d3a4d755f Update downloader middleware documentation 2013-01-06 18:53:14 +00:00
Emanuel Schorsch
f9b130da12 Proposed Changes
I was very confused as to how you actually import DjangoItem.
I searched extensively on the internet looking for actual code so I could see how it worked.
I finally found http://blog.just2us.com/2012/07/setting-up-django-with-scrapy/. It is much easier to understand with full files instead of code fragments.
I also edited where it says "we can see that the model is already saved" as I don't see how it's already saved.
2013-01-04 15:59:04 -05:00
Natan L
d572f8945e Fixed typo
'persitent' --> 'persistent'
2012-12-31 11:14:01 -08:00
Pedro Faustino
492831fc6f Merge branch 'master' of git://github.com/scrapy/scrapy into http-cache-middleware 2012-12-28 15:27:45 +01:00
Hasnain Lakhani
93a1102189 Implemented policies for HTTP Cache 2012-12-26 16:29:48 -08:00
Pablo Hoffman
51b8feb4ce fixed doc typos 2012-12-26 16:16:53 -02:00
Pablo Hoffman
1e2ee76df2 add documentation topics: Broad Crawls & Common Practies 2012-12-26 14:02:13 -02:00
Pedro Faustino
fdaa35f6e8 Updated the downloader middleware documentation to reflect changes introduced by the support for real HTTP caching. 2012-12-24 19:37:53 +01:00
Pablo Hoffman
12475fccbe Merge pull request #206 from dangra/downloader-enhancements
AutoThrottle and Downloader enhancements
2012-12-17 10:11:46 -08:00
Daniel Graña
d7daf836d5 Altering delay is enough to auto throttle downloads 2012-12-17 16:08:49 -02:00
Luan
5582ea28ec Update docs/topics/commands.rst
A short change.
2012-12-10 15:16:02 -02:00
Pablo Hoffman
39274a2457 doc: removed obsolete references to ClientForm 2012-11-23 19:06:47 -02:00
stav
99f164fc87 correct docs for default storage backend 2012-11-22 14:05:47 -06:00
Ilya Baryshev
097aea04a4 Fixed docs typo in SpiderOpenCloseLogging example 2012-11-10 12:24:53 +04:00
Pablo Hoffman
aa0e02dc54 added open_in_browser to debugging doc 2012-11-04 19:58:06 -02:00
Pablo Hoffman
7a7c5d1334 removed reference to global scrapy stats from settings doc 2012-11-03 17:05:01 -02:00
Pablo Hoffman
8f4c879b58 extended documentation on how to access crawler stats from extensions 2012-10-25 11:28:23 -02:00
Pablo Hoffman
1a905d62f5 removed scrapy.log.started attribute, and avoid checking if log has already been started (since it should be called once anyway) 2012-10-09 16:05:19 -02:00
Pablo Hoffman
1f89eb59fe fixed doc reference to topics-contracts 2012-10-09 16:02:12 -02:00
Pablo Hoffman
7458092eef added spider contracts to release notes and warn that its API is still subject to change 2012-09-29 03:06:30 -03:00
Pablo Hoffman
c380910b40 Merge pull request #167 from alexcepoi/sep-017
Spider contracts (SEP-017)
2012-09-28 13:57:07 -07:00
Alex Cepoi
11d29c7005 SEP-017 contracts: add tests and minor improvements 2012-09-21 00:12:46 +02:00
Pablo Hoffman
b46b5a6ef0 Documented AutoThrottle extension and added to extensions available by
default. Also deprecated concurrency and delay settings, in favour of
using the standard Scrapy ones.
2012-09-20 18:52:57 -03:00
Pablo Hoffman
c7f8219901 - removed scrapy.conf singleton from scrapy.log, scrapy.responsetypes,
scrapy.http.response.text, scrapy.selector
- fixed bug with scrapy.conf.settings backwards compatibility support
- added facility to notify (and provide some guidelines) about deprecated/obsolete settings
2012-09-19 03:03:34 -03:00
stav
303e13f616 selector documentation typos 2012-09-18 12:56:52 -05:00
Pablo Hoffman
3d736e657f fixed typo in doc 2012-09-18 10:51:01 -03:00
Pablo Hoffman
eed6eb49da make DBM the new default storage backend for HTTP cache middleware, simplified DBM storage backend code to avoid dealing with many spiders at once (not needed), and update httpcache stats names (hit -> hits, miss -> misses) 2012-09-17 10:11:07 -03:00
Pablo Hoffman
8f2dda12cc removed another instance of scrapy.conf.settings singleton, this time from scrapy.utils.trackref. From now on, trackrefs functionality will be always enabled as it imposes a very minimal performance overhead 2012-09-16 21:21:44 -03:00
Pablo Hoffman
81ed2d2d0b major Stats Collection refactoring: removed separation of global/per-spider stats, removed stats-related signals (stats_spider_opened, etc). Stats are much simpler now, backwards compatibility is kept on the Stats Collector API. 2012-09-14 12:31:33 -03:00
Pablo Hoffman
a874964ad4 renamed 'XPath Selectors' title to just 'Selectors' 2012-09-13 15:24:44 -03:00
Pablo Hoffman
acb8895e1a changed note in scrapyd doc to use sphinx notes 2012-09-13 15:22:59 -03:00
Pablo Hoffman
26f1d5cb48 Merge pull request #171 from artem-dev/scrapyd_job_times
added start and stop times for scrapyd list jobs web service
2012-09-13 11:16:01 -07:00
Artem Bogomyagkov
4d8f253912 commited doc file missed from prev commit 2012-09-12 11:12:59 +03:00