mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 08:03:51 +00:00

2616 Commits

Author SHA1 Message Date
Pablo Hoffman
b7c9503d9c disable lxml selectors tests which was failing on certain versions of lxml-libxml2 2010-10-26 16:50:34 -02:00
Pablo Hoffman
a09dd18bbb use absolute imports for compatibility with python 2.5 2010-10-26 16:37:16 -02:00
Martin Olveyra
41a85f9d14 Added support for variants when applied to tag attributes. fixed handling of variants with single attribute. Removed unneeded object attribute surrounds_variant. Added new tests cases for fixes. 2010-10-26 16:11:04 -02:00
Daniel Grana
a5a6c2ae31 Automated merge with ssh://hg.scrapy.org/scrapy-0.10 2010-10-25 15:57:56 -02:00
Daniel Grana
99096cdacf Image pipeline should upload images with image/jpeg content type. closes #257 2010-10-25 15:56:07 -02:00
Pablo Hoffman
df63577243 added lxml to setup.py install_requires, if using setuptools. refs #147 2010-10-25 14:57:50 -02:00
Pablo Hoffman
f7283ad18e skip lxml selector tests if lxml is not available. refs #147 2010-10-25 14:56:33 -02:00
Pablo Hoffman
a59bfb539d * Added lxml backend for XPath selectors. Closes #147
* Added new setting (SELECTORS_BACKEND) to choose which backend to use
* Deprecated the extract_unquoted() function from selectors
* Made libxml2 optional by adding a dummy selector backend. Closes #260

--HG--
rename : scrapy/tests/test_selector.py => scrapy/tests/test_selector_libxml2.py
2010-10-25 14:47:10 -02:00
Daniel Grana
7640e99979 test media_to_download mediapipeline hook. ref #269 2010-10-25 13:16:11 -02:00
Daniel Grana
fe1dd3a93d disconnect signals before uninstalling crawler in image tests 2010-10-23 05:10:52 -02:00
Daniel Grana
ad43917322 Add tests to MediaPipeline. closes #269
--HG--
extra : rebase_source : ccf726e147b5c97f7cba60d20ce2fca58c687a3e
2010-10-23 04:44:55 -02:00
Martin Olveyra
d17edd4a59 Fix wrong slice of tokens in recursive extraction of follow region 2010-10-22 14:21:52 -02:00
Daniel Grana
72d08383bb Automated merge with ssh://hg.scrapy.org/scrapy-0.10 2010-10-22 00:28:35 -02:00
Daniel Grana
2873c4d9fe SimpleDB stats doesn't use AWS auth from settings.py (thanks geoffwatts). closes #264 2010-10-22 00:21:54 -02:00
Pablo Hoffman
992683ac5c Deploy command requires project 2010-10-21 13:24:02 -02:00
Pablo Hoffman
f8b4d1dc5d Fixed compatibility with Python 2.5 2010-10-21 12:53:40 -02:00
Pablo Hoffman
6c921896a5 Expanded documentation on deploy command and versions. Refs #261 2010-10-19 00:11:45 -02:00
Pablo Hoffman
1d567cdce6 Added new 'deploy' command. Closes #261 2010-10-18 22:38:46 -02:00
Pablo Hoffman
7d8f922df9 Added documentation for CLOSESPIDER_ERRORCOUNT setting. Refs #254 2010-10-18 22:36:30 -02:00
Pablo Hoffman
c96f17c43d Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-18 03:21:21 -02:00
Pablo Hoffman
deb9c7ef04 Reversed scrapy.cfg lookup order so that the one in the current project has more precedence. Also added alternative system-wide location for windows. 2010-10-18 03:18:54 -02:00
Pablo Hoffman
98662e53ea Formatting fix in Scrapyd doc 2010-10-17 03:20:23 -02:00
Pablo Hoffman
a3d85da96f Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-16 19:54:24 -02:00
Pablo Hoffman
5f65c26080 Some minor improvements to feature list in Scrapy at a Glance documentation page 2010-10-16 19:02:08 -02:00
Pablo Hoffman
9d16ff09cb Automated merge with http://hg.scrapy.org/scrapy-0.10/ 2010-10-11 21:29:52 -02:00
Pablo Hoffman
f5b188b179 Make RetryMiddleware obey Request.meta 'dont_retry' key when processing exceptions. Closes #259 2010-10-11 21:28:42 -02:00
Pablo Hoffman
d5c8caf07b Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-10 20:31:38 -02:00
Pablo Hoffman
b4fbc6c5fa Updated Scrapy Tutorial to reference feed exports, instead a custom written pipeline, and extended item pipeline documentation to include a JSON writer. 2010-10-10 20:31:05 -02:00
Pablo Hoffman
5267772d36 Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-09 20:43:47 -02:00
Pablo Hoffman
0b91c04007 Fixed issue with non-standard line ending in HTTP headers. Closes #258 2010-10-09 20:43:05 -02:00
Martin Olveyra
a99bb0de9b Use binary flag in read/write operations, to fix tests in windows 2010-10-08 11:28:44 -02:00
Pablo Hoffman
aa4142e4ba Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-07 18:23:48 -02:00
Pablo Hoffman
f4accb6c7f Updated dmoz xpaths of Scrapy tutorial 2010-10-07 18:22:01 -02:00
Martin Olveyra
a67a981cca Simplification of html parsing algorithm, fixed some tests (with new algorithm, comments inside bigger text region are generated separated from the text). Added test for a case not correctly handled by previous algorithm. Fixed test checking 2010-10-07 14:55:29 -02:00
Martin Olveyra
fafaee51d5 htmlpage tests reorganization and fixes: improved how differences between expected and result are shown, and check also correct parsing of tag_type 2010-10-07 14:55:00 -02:00
Pablo Hoffman
571aeb559b Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-10-05 12:44:41 -02:00
Pablo Hoffman
4bbcbd7b77 Don't fail if twisted is not available on scrapy/__init__.py, to avoid making setup.py depend on Twisted. Closes #256 2010-10-05 12:43:34 -02:00
Pablo Hoffman
2d40705ea0 CloseSpider extension: Added support for closing spider after N errors have been raised. Closes #254 2010-09-30 20:17:44 -03:00
Pablo Hoffman
b6e7a38a3a Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-09-29 13:37:33 -03:00
Pablo Hoffman
61ab9b86b7 Bumped version to 0.10.4 2010-09-29 13:36:36 -03:00
Pablo Hoffman
ad0f180dd9 Added tag 0.10.3 for changeset 803efdb19e0b 2010-09-29 13:34:56 -03:00
Pablo Hoffman
d15a97ff61 Updated Scrapy version in debian/changelog 2010-09-28 16:45:05 -03:00
Pablo Hoffman
7826869cb2 Added missing colon 2010-09-28 16:44:53 -03:00
Martin Santos
0bf9e4627c added support to CloseSpider extension, for close the spider after N pages have been crawled. Using the CLOSESPIDER_PAGECOUNT setting. closes #253 2010-09-28 16:29:37 -03:00
Pablo Hoffman
0976e0788e Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-09-27 12:27:58 -03:00
Pablo Hoffman
49ffe528a3 Fixed listen_tcp function when receiving None or 0 in portrange argument. Closes #252 (tag: 0.10.3) 2010-09-27 12:27:32 -03:00
Pablo Hoffman
50e57b08b0 Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-09-27 08:20:05 -03:00
Pablo Hoffman
9206806770 setup.py: added support for generating version from hg revision 2010-09-27 08:19:32 -03:00
Pablo Hoffman
51325fc93e Automated merge with http://hg.scrapy.org/scrapy-0.10 2010-09-27 07:57:23 -03:00
Pablo Hoffman
52d198afc9 Removed forked cookielib tests, because Python cookielib has been suffering several changes and maintaining a fork of the tests has become a pain. Instead, we've added specific tests for the urllib2 request/response wrappers 2010-09-27 07:55:27 -03:00