mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 04:24:25 +00:00

3981 Commits

Author SHA1 Message Date
duendex
36e4fc3785 Removed some trailing spaces that I left. 2013-12-02 20:19:42 -02:00
duendex
ae28c7d698 Adds the functionality to do HTTPS downloads behind proxies using an
HTTP CONNECT.
2013-12-02 20:19:41 -02:00
Pablo Hoffman
f2741c413e fix method name in tutorial. closes GH-480 2013-12-02 13:24:12 -02:00
Nikolay Golub
a651a75248 fix logging error with unicode spider name
Log message fails if spider name is in unicode, because "system" key in eventDict isn't encoded.
2013-11-30 21:08:56 +04:00
Daniel Graña
e34ffc0f42 Add 0.20.1 release notes
Conflicts:
	docs/news.rst
2013-11-28 16:25:57 -02:00
Daniel Graña
cfe588103c include_package_data is required to build wheels from published sources 2013-11-28 16:25:57 -02:00
Paul Tremberth
13002564b9 Add tests for OffsiteMiddleware() + use re.escape() in domains regexp 2013-11-28 15:40:22 +01:00
Pablo Hoffman
339861367e Merge pull request #425 from audiodude/master
DownloaderMiddleware docs: Update process_request and minor cleanups.
2013-11-25 10:33:35 -08:00
Daniel Graña
aeeba2147b travis-ci updated to pypy 2.2 2013-11-25 13:57:40 -02:00
Daniel Graña
36c8da2ad6 Merge pull request #461 from redapple/selectorloader
Add "unified" SelectorItemLoader (supports .add_css() and .add_xpath())
2013-11-22 12:10:39 -08:00
Pablo Hoffman
545f2601b0 Merge pull request #469 from redapple/xgzip
Remove "x-gzip" from Requests' "Accept-Encoding" header
2013-11-21 10:55:07 -08:00
Paul Tremberth
0d99babe40 Remove "x-gzip" from Requests' "Accept-Encoding" header 2013-11-21 18:58:24 +01:00
Paul Tremberth
14f5817d6b Modify ItemLoader to support XPath and CSS selectors
Deprecate XPathItemLoader (now an alias to the new ItemLoader)
2013-11-21 18:05:24 +01:00
Pablo Hoffman
f87be371a2 better names for HANDLE_* settings, and added doc 2013-11-21 14:33:17 -02:00
Daniel Graña
ab01e9e9e4 Merge pull request #466 from kalessin/httperror
allow to use settings for defining http error handling defaults
2013-11-21 03:55:44 -08:00
Martin Olveyra
55bee912a2 allow to use settings for defining http error handling defaults 2013-11-20 20:12:49 -02:00
Mikhail Korobov
8416cc7515 Merge pull request #465 from bjlange/master
Add note to item-pipeline documentation explaining order
2013-11-20 09:40:19 -08:00
Brian Lange
e4c1d8d37d Elaborate on use of order numbers 2013-11-19 17:51:50 -06:00
Daniel Graña
2564c21d4c add a tox env for Python 3.3 2013-11-19 20:15:15 -02:00
Daniel Graña
526a944eda lxml is required, no need to skip tests. 2013-11-19 20:14:48 -02:00
Daniel Graña
3f156ad845 Do not call body_as_unicode on non text responses. closes #462 2013-11-19 20:13:34 -02:00
Brian Lange
b878f60b5a Add note to item-pipeline documentation explaining order in the ITEM_PIPELINES setting. 2013-11-19 16:12:54 -06:00
Daniel Graña
ec7833a910 Deprecate body_or_str helper function only used by xml iterators 2013-11-19 19:21:54 -02:00
Pablo Hoffman
2d91c7136d Merge pull request #464 from kalessin/telnet
telnet client: fix nonexistent reference to engine.slots
2013-11-19 05:22:58 -08:00
olveyra
755b9ba5a4 telnet client: fix nonexistent reference to engine.slots 2013-11-19 04:52:24 +01:00
Pablo Hoffman
afe6eaa2fe Merge pull request #460 from tntC4stl3/master
duplicate 'use' in line 87
2013-11-15 04:10:49 -08:00
tntC4stl3
b51d5d81e4 duplicate 'use' in line 87 2013-11-15 13:56:44 +08:00
Daniel Graña
c74903f9da process_parallel was leaking the failures on its internal deferreds. closes #458
DeferredList implemented cancellation in Twisted 13.2.0 by holding a
reference to the affected deferred objects. If a deferred errored, the
result was propagated to the DeferredList but was still referenced by the
original deferred, and nobody was consuming it.

The tests started to fail because the reference from DeferredList
prevented the underlying deferred from being collected before the test
finished, invalidating the effect of the self.flushLoggedErrors() call.
2013-11-09 02:12:52 -02:00
Daniel Graña
04ff7ecebf improve 0.20 release notes
Conflicts:
	docs/news.rst
2013-11-08 17:45:03 -02:00
Daniel Graña
3d18a3c49e bumped version to 0.21.0 0.21.0 2013-11-08 17:09:00 -02:00
Daniel Graña
d0980e5c9b Merge 0.20 release notes 2013-11-08 17:06:10 -02:00
Daniel Graña
60516123a5 Merge branch 'travix-toxed'
Conflicts:
	requirements.txt
2013-11-07 11:29:58 -02:00
Daniel Graña
d29791d7ab building Pillow with pypy requires dev headers 2013-11-07 10:58:08 -02:00
Daniel Graña
971f60d796 TOXENV is tox supported env 2013-11-07 10:58:08 -02:00
Daniel Graña
ecfa743105 install updated pypy from ppa 2013-11-07 10:58:08 -02:00
Daniel Graña
fabb351097 map travis-ci matrix to tox environments 2013-11-07 10:58:08 -02:00
Daniel Graña
bc7fa61136 Django 1.6 form validation errors now include ValidationError exception instances instead of just strings 2013-11-07 10:36:39 -02:00
Rolando Espinoza La fuente
6f5423aebd Replaced remaining __import__(module) calls.
This commit replaces the statements __import__(module) as the previous
replaced the statements __import__(module, {}, {}, ['']).

At first I thought of leaving the single-argument calls, but perhaps it's
better to be strict rather than having exceptions to the rule in this
case.
2013-11-07 10:36:39 -02:00
Rolando Espinoza La fuente
6b1760d7a1 replaced __import__ by importlib.import_module.
Since python 2.7, importlib.import_module is the recommended way to
import modules programmatically.

From __import__'s doc:

    Import a module. Because this function is meant for use by the Python
    interpreter and not for general use it is better to use
    importlib.import_module() to programmatically import a module.
2013-11-07 10:36:39 -02:00
Daniel Graña
b6bed44c2b Django 1.6 form validation errors now include ValidationError exception instances instead of just strings 2013-11-07 02:32:10 -02:00
Daniel Graña
b78e76108f Merge pull request #445 from darkrho/import-module
Use `importlib.import_module` instead of `__import__`
2013-11-06 20:14:27 -08:00
Daniel Graña
2318c56f14 shutdown the active crawler on SIGINT. fixes #450 2013-11-05 00:02:17 -02:00
Mikhail Korobov
f80f10ae7e Merge pull request #452 from alexanderlukanin13/python3
PY3: scrapy.__version__, NoneType, urlparse_monkeypatches
2013-11-04 00:54:02 -08:00
alexanderlukanin13
6c7292a08e python3: scrapy.__version__, NoneType, urlparse_monkeypatches 2013-11-03 22:20:33 +06:00
Rolando Espinoza La fuente
d1b912890b Merge branch 'master' into import-module 2013-10-30 21:07:53 -04:00
Pablo Hoffman
fa245af6d2 Merge pull request #448 from dangra/drop-py26
Drop Python 2.6 support
2013-10-30 05:53:06 -07:00
Daniel Graña
2df8156431 Drop Python 2.6 support 2013-10-29 13:44:00 -02:00
Rolando Espinoza La fuente
10e22aa5fb Replaced remaining __import__(module) calls.
This commit replaces the statements __import__(module) as the previous
replaced the statements __import__(module, {}, {}, ['']).

At first I thought of leaving the single-argument calls, but perhaps it's
better to be strict rather than having exceptions to the rule in this
case.
2013-10-27 19:10:25 -04:00
Rolando Espinoza La fuente
343f997ed6 replaced __import__ by importlib.import_module.
Since python 2.7, importlib.import_module is the recommended way to
import modules programmatically.

From __import__'s doc:

    Import a module. Because this function is meant for use by the Python
    interpreter and not for general use it is better to use
    importlib.import_module() to programmatically import a module.
2013-10-27 18:33:51 -04:00
Daniel Graña
bd79b6e1d3 debian package requires python-cssselect 2013-10-24 16:47:46 -02:00
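
Note on the import-module commits listed above (343f997ed6, 10e22aa5fb and their rebased counterparts): the messages describe replacing `__import__(module)` and `__import__(module, {}, {}, [''])` with `importlib.import_module`. The sketch below is not taken from the Scrapy source; it only illustrates, using the standard-library module `json.decoder` as an arbitrary stand-in, why the two spellings of `__import__` behave differently and what `import_module` returns.

```python
import importlib

# Dotted module name chosen purely for illustration (an assumption,
# not a module that the commits above touch).
name = "json.decoder"

# With a dotted name and no fromlist, __import__ returns the
# top-level package ("json"), not the submodule that was named.
top = __import__(name)

# Passing a non-empty fromlist makes __import__ return the submodule
# itself; this is the older idiom mentioned in the commit message.
leaf_old = __import__(name, {}, {}, [""])

# importlib.import_module returns the named submodule directly.
leaf_new = importlib.import_module(name)

print(top.__name__)
print(leaf_old.__name__)
print(leaf_new.__name__)
```

The single-call form is easier to read than the fromlist idiom, which is presumably part of why the commit author preferred applying it uniformly.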