Pablo Hoffman
7400ceb1ed
added 502 to RETRY_HTTP_CODES
2013-02-22 19:12:59 -02:00
Pablo Hoffman
a038f46859
doc: fixed rst title
2013-02-14 11:11:17 -02:00
Pablo Hoffman
22edc44c6c
doc: remove links to diveintopython.org, which is no longer available. closes #246
2013-02-14 11:09:40 -02:00
Daniel Graña
5db45b3825
remove scrapyd, it was migrated to its own repository
2013-02-06 05:24:07 +00:00
whodatninja
8e3b5baac5
Fix typo labeling attrs type bool instead of list
2013-02-05 15:10:41 -05:00
Chris Tilden
aae6aed4fb
fixes spelling errors in documentation
2013-01-22 14:52:18 -08:00
Pablo Hoffman
6ab8afb992
improve documentation about removing namespaces
2013-01-18 12:35:30 -02:00
Pablo Hoffman
1ba04b1fc3
added remove_namespaces() method to XmlXPathSelector objects
2013-01-18 12:20:03 -02:00
Pablo Hoffman
c31441a273
revert default HTTP cache policy to dummy (instead of RFC2616)
2013-01-17 13:08:29 -02:00
Daniel Graña
897195186a
document new FormRequest parameter named formxpath
that matches forms using xpath
2013-01-08 18:36:20 -02:00
Daniel Graña
75563b3f00
Add list of supported and missing RFC2616 caching features
2013-01-08 18:16:44 -02:00
Daniel Graña
d8a760bf57
Merge branch 'http-cache-middleware'
Conflicts:
scrapy/contrib/downloadermiddleware/httpcache.py
scrapy/contrib/httpcache.py
scrapy/tests/test_downloadermiddleware_httpcache.py
2013-01-08 17:34:48 -02:00
Daniel Graña
864a7aef87
More httpcache updates
* Change default cache policy to RFC2616
* Update HttpCacheMiddleware documentation
* Move policies to scrapy.contrib.httpcache
* remove a lint error for .has_key() usage in DBM storage backend
2013-01-08 17:26:32 -02:00
Daniel Graña
defc4f89b5
update metarefresh settings
2013-01-08 11:41:19 -02:00
Daniel Graña
6a2b23883a
Add MetaRefreshMiddleware docs
2013-01-08 11:25:38 -02:00
Daniel Graña
076ba40404
update DOWNLOADER_MIDDLEWARES_BASE setting documentation
2013-01-08 10:50:27 -02:00
Pablo Hoffman
227a1d666b
add doc about disabling an extension. refs #132
2013-01-07 13:16:19 -02:00
Pedro Faustino
5d3a4d755f
Update downloader middleware documentation
2013-01-06 18:53:14 +00:00
Emanuel Schorsch
f9b130da12
Proposed Changes
I was very confused as to how you actually import DjangoItem.
I searched extensively on the internet looking for actual code so I could see how it worked.
I finally found http://blog.just2us.com/2012/07/setting-up-django-with-scrapy/ . It is much easier to understand with full files instead of code fragments.
I also edited where it says "we can see that the model is already saved" as I don't see how it's already saved.
2013-01-04 15:59:04 -05:00
Natan L
d572f8945e
Fixed typo
'persitent' --> 'persistent'
2012-12-31 11:14:01 -08:00
Pedro Faustino
492831fc6f
Merge branch 'master' of git://github.com/scrapy/scrapy into http-cache-middleware
2012-12-28 15:27:45 +01:00
Hasnain Lakhani
93a1102189
Implemented policies for HTTP Cache
2012-12-26 16:29:48 -08:00
Pablo Hoffman
51b8feb4ce
fixed doc typos
2012-12-26 16:16:53 -02:00
Pablo Hoffman
1e2ee76df2
add documentation topics: Broad Crawls & Common Practices
2012-12-26 14:02:13 -02:00
Pedro Faustino
fdaa35f6e8
Updated the downloader middleware documentation to reflect changes introduced by the support for real HTTP caching.
2012-12-24 19:37:53 +01:00
Pablo Hoffman
12475fccbe
Merge pull request #206 from dangra/downloader-enhancements
AutoThrottle and Downloader enhancements
2012-12-17 10:11:46 -08:00
Daniel Graña
d7daf836d5
Altering delay is enough to auto throttle downloads
2012-12-17 16:08:49 -02:00
Luan
5582ea28ec
Update docs/topics/commands.rst
A short change.
2012-12-10 15:16:02 -02:00
Pablo Hoffman
39274a2457
doc: removed obsolete references to ClientForm
2012-11-23 19:06:47 -02:00
stav
99f164fc87
correct docs for default storage backend
2012-11-22 14:05:47 -06:00
Ilya Baryshev
097aea04a4
Fixed docs typo in SpiderOpenCloseLogging example
2012-11-10 12:24:53 +04:00
Pablo Hoffman
aa0e02dc54
added open_in_browser to debugging doc
2012-11-04 19:58:06 -02:00
Pablo Hoffman
7a7c5d1334
removed reference to global scrapy stats from settings doc
2012-11-03 17:05:01 -02:00
Pablo Hoffman
8f4c879b58
extended documentation on how to access crawler stats from extensions
2012-10-25 11:28:23 -02:00
Pablo Hoffman
1a905d62f5
removed scrapy.log.started attribute, and avoid checking if log has already been started (since it should be called once anyway)
2012-10-09 16:05:19 -02:00
Pablo Hoffman
1f89eb59fe
fixed doc reference to topics-contracts
2012-10-09 16:02:12 -02:00
Pablo Hoffman
7458092eef
added spider contracts to release notes and warn that its API is still subject to change
2012-09-29 03:06:30 -03:00
Pablo Hoffman
c380910b40
Merge pull request #167 from alexcepoi/sep-017
Spider contracts (SEP-017)
2012-09-28 13:57:07 -07:00
Alex Cepoi
11d29c7005
SEP-017 contracts: add tests and minor improvements
2012-09-21 00:12:46 +02:00
Pablo Hoffman
b46b5a6ef0
Documented AutoThrottle extension and added to extensions available by default. Also deprecated concurrency and delay settings, in favour of using the standard Scrapy ones.
2012-09-20 18:52:57 -03:00
Pablo Hoffman
c7f8219901
- removed scrapy.conf singleton from scrapy.log, scrapy.responsetypes, scrapy.http.response.text, scrapy.selector
- fixed bug with scrapy.conf.settings backwards compatibility support
- added facility to notify (and provide some guidelines) about deprecated/obsolete settings
2012-09-19 03:03:34 -03:00
stav
303e13f616
selector documentation typos
2012-09-18 12:56:52 -05:00
Pablo Hoffman
3d736e657f
fixed typo in doc
2012-09-18 10:51:01 -03:00
Pablo Hoffman
eed6eb49da
make DBM the new default storage backend for HTTP cache middleware, simplified DBM storage backend code to avoid dealing with many spiders at once (not needed), and update httpcache stats names (hit -> hits, miss -> misses)
2012-09-17 10:11:07 -03:00
Pablo Hoffman
8f2dda12cc
removed another instance of scrapy.conf.settings singleton, this time from scrapy.utils.trackref. From now on, trackrefs functionality will be always enabled as it imposes a very minimal performance overhead
2012-09-16 21:21:44 -03:00
Pablo Hoffman
81ed2d2d0b
major Stats Collection refactoring: removed separation of global/per-spider stats, removed stats-related signals (stats_spider_opened, etc). Stats are much simpler now, backwards compatibility is kept on the Stats Collector API.
2012-09-14 12:31:33 -03:00
Pablo Hoffman
a874964ad4
renamed 'XPath Selectors' title to just 'Selectors'
2012-09-13 15:24:44 -03:00
Pablo Hoffman
acb8895e1a
changed note in scrapyd doc to use sphinx notes
2012-09-13 15:22:59 -03:00
Pablo Hoffman
26f1d5cb48
Merge pull request #171 from artem-dev/scrapyd_job_times
added start and stop times for scrapyd list jobs web service
2012-09-13 11:16:01 -07:00
Artem Bogomyagkov
4d8f253912
committed doc file missed from prev commit
2012-09-12 11:12:59 +03:00