mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 11:03:45 +00:00

3297 Commits

Author SHA1 Message Date
Rolando Espinoza La fuente
6ed44a3a67 httpcache: added tests for custom dbm module.
And removed the break in the line to improve readability as there
is no strict 80-chars line width convention.
2013-02-01 12:41:52 -04:00
Rolando Espinoza La fuente
a2602df90b httpcache: allow to import submodules within packages as HTTPCACHE_DBM_MODULE setting. 2013-01-31 23:49:36 -04:00
Daniel Graña
e5edb8ec3e Merge branch 'dangra/issue-24' 2013-01-30 16:55:53 -02:00
Daniel Graña
3af240f5f9 Merge branch 'dangra/issue-12' 2013-01-30 16:55:16 -02:00
Daniel Graña
0d3e4b4c43 do not unquote slash and question mark in url paths. fix #24 2013-01-30 11:22:03 -02:00
Daniel Graña
872a22df68 pep8ize sgml link extractors 2013-01-29 18:08:32 -02:00
Daniel Graña
cc69b3aa4c Fix #199 encoding error concatenating link text 2013-01-29 17:56:34 -02:00
Daniel Graña
95fde0a498 register namespaces when XMLFeedSpider uses iternodes mode. fixes #12 2013-01-29 12:00:13 -02:00
Daniel Graña
8e77f27897 Find form nodes in invalid html5 documents
lxml fails to parse invalid html5 documents
This error was reported in scrapy/loginform#3
2013-01-24 17:47:53 -02:00
Daniel Graña
ff04480675 fix exporting nested items as xml. fixes #66 2013-01-24 14:57:53 -02:00
Pablo Hoffman
71d7df9273 simplify InitSpider implementation, fixing one bug (closes #228) and adding support generators return from start_requests() (which the previous version didn't) 2013-01-23 16:28:41 -02:00
Daniel Graña
3cf7f4975b Add 0.16.4 to release notes
Conflicts:
	docs/news.rst
2013-01-23 11:29:38 -02:00
Daniel Graña
c40f947dc1 Merge pull request #229 from christilden/master
fixes spelling errors in documentation
2013-01-22 20:27:40 -08:00
Chris Tilden
aae6aed4fb fixes spelling errors in documentation 2013-01-22 14:52:18 -08:00
Pablo Hoffman
27583922a7 remove unused imports 2013-01-21 14:01:12 -02:00
Pablo Hoffman
65258e3621 Merge pull request #227 from tonal/correct-old-cache
Correct init bag for load FilesystemCacheStorage from old location
2013-01-21 04:57:19 -08:00
Alexandr N Zamaraev (aka tonal)
2c51266a40 Correct init bag for load old scrapy.contrib.httpcache.FilesystemCacheStorage 2013-01-21 13:54:36 +07:00
Pablo Hoffman
6ab8afb992 improve documentation about removing namespaces 2013-01-18 12:35:30 -02:00
Pablo Hoffman
1ba04b1fc3 added remove_namespaces() method to XmlXPathSelector objects 2013-01-18 12:20:03 -02:00
Pablo Hoffman
b7eeeff410 get rid of assertDictEqual (since it's python 2.7+ only) 2013-01-17 13:18:23 -02:00
Pablo Hoffman
c31441a273 revert default HTTP cache policy to dummy (instead of RFC2616) 2013-01-17 13:08:29 -02:00
Daniel Graña
897195186a document new FormRequest parameter named formxpath that matches forms using xpath 2013-01-08 18:36:20 -02:00
Daniel Graña
7527ef97ba Merge pull request #185 from notsobad/master
Added xpath support in FormRequest.from_response
2013-01-08 12:34:06 -08:00
Daniel Graña
75563b3f00 Add list of supported and missing RFC2616 caching features 2013-01-08 18:16:44 -02:00
Daniel Graña
3cbc4d0b94 django is an optional_features, its imports must not fail 2013-01-08 17:56:46 -02:00
Daniel Graña
d8a760bf57 Merge branch 'http-cache-middleware'
Conflicts:
	scrapy/contrib/downloadermiddleware/httpcache.py
	scrapy/contrib/httpcache.py
	scrapy/tests/test_downloadermiddleware_httpcache.py
2013-01-08 17:34:48 -02:00
Daniel Graña
864a7aef87 More httpcache updates
* Change default cache policy to RFC2616
* Update HttpCacheMiddleware documentation
* Move policies to scrapy.contrib.httpcache
* remove a lint error for .has_key() usage in DBM storage backend
2013-01-08 17:26:32 -02:00
Daniel Graña
487299e068 TakeFirst doc says it returns first non-null/non-empty value, zero is a valid value. closes #59 2013-01-08 15:47:33 -02:00
Daniel Graña
672d09ea2e add meta-refresh changes to release notes 2013-01-08 12:30:36 -02:00
Daniel Graña
9527c5819a pep8ize settings 2013-01-08 11:48:36 -02:00
Daniel Graña
defc4f89b5 update metarefresh settings 2013-01-08 11:41:19 -02:00
Daniel Graña
6a2b23883a Add MetaRefreshMiddleware docs 2013-01-08 11:25:38 -02:00
Daniel Graña
076ba40404 update DOWNLOADER_MIDDLEWARES_BASE setting documentation 2013-01-08 10:50:27 -02:00
Daniel Graña
71db7f1b25 Split redirection into status and metarefresh middlewares, also changes httpcompression priority. closes #78 2013-01-08 09:59:38 -02:00
Rolando Espinoza La fuente
fe5d0ce2e0 tests: added downloader middleware manager integration tests for gzipped redirection. 2013-01-08 09:59:38 -02:00
Pablo Hoffman
227a1d666b add doc about disabling an extension. refs #132 2013-01-07 13:16:19 -02:00
Pedro Faustino
5d3a4d755f Update downloader middleware documentation 2013-01-06 18:53:14 +00:00
Pedro Faustino
59dc71f394 Merge branch 'http-cache-middleware', remote-tracking branch 'dangra/http-cache-middleware' into http-cache-middleware 2013-01-06 17:53:25 +00:00
Pablo Hoffman
7f990a4af2 Merge pull request #221 from emschorsch/patch-1
Proposed Changes to DjangoItem documentation
2013-01-04 14:17:54 -08:00
Emanuel Schorsch
f9b130da12 Proposed Changes
I was very confused as to how you actually import DjangoItem.
I searched extensively on the internet looking for actual code so I could see how it worked.
I finally found http://blog.just2us.com/2012/07/setting-up-django-with-scrapy/. It is much easier to understand with full files instead of code fragments.
I also edited where it says "we can see that the model is already saved" as I don't see how it's already saved.
2013-01-04 15:59:04 -05:00
Pablo Hoffman
acb7bad1ff Merge pull request #218 from Mimino666/django-item-validation
Django item validation
2013-01-04 10:35:20 -08:00
Daniel Graña
3f03a2ca50 requests with no-cache set must force revalidation of cached responses 2013-01-04 04:20:45 -02:00
Daniel Graña
cdecc760ee default httpcache to rfc2616 policy and improve storage and policy tests 2013-01-04 03:43:25 -02:00
Daniel Graña
0a5586fafd move FilesystemCacheStorage to scrapy.contrib.httpcache 2013-01-03 09:23:18 -02:00
Pablo Hoffman
9f003a73da Merge pull request #217 from Mimino666/item-processing-errormsg
Fixed error message formatting.
2013-01-02 21:04:40 -08:00
Michal Danilak
2cfbd13c17 Added "exclude" parameter testing to unittests. 2013-01-03 02:02:03 +01:00
Michal Danilak
035d1e99a9 Added model validation to DjangoItem. 2013-01-03 01:52:53 +01:00
Michal Danilak
8ea89b277c Fixed error message formatting.
log.err() doesn't support cool formatting and when an error occurred, the message was:
	"ERROR: Error processing %(item)s"
2013-01-02 22:55:04 +01:00
Pablo Hoffman
ea0967562e Merge pull request #216 from Mimino666/shell-spider-option
Added --spider option to "shell" command.
2013-01-02 09:15:15 -08:00
Daniel Graña
a6ef76ed88 lint and improve images pipeline error logging 2013-01-02 14:14:37 -02:00