mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 18:24:30 +00:00

1250 Commits

Author SHA1 Message Date
Pablo Hoffman
239b056ddd fixed encoding problem in RegexLinkExtractor when extracting links using restrict_xpaths, and added unittests. also enabled previously disabled restrict_xpaths tests.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401100
2009-04-29 19:04:27 +00:00
Daniel Grana
e5892050bd templates: fix default item class path
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401099
2009-04-28 13:48:05 +00:00
Daniel Grana
cc62bba80e closedomain: check if domain was opened successfully before removing not existent task
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401098
2009-04-27 18:49:13 +00:00
Pablo Hoffman
940d0090af commented out irrelevant cookie test, and converted file to use encoded chars instead of real chars with an encoding declaration on top
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401097
2009-04-27 18:18:01 +00:00
Pablo Hoffman
f4e7ee5dac some improvements to XMLFeedSpider including namespaces patch submitted by talboito, refs #81
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401096
2009-04-27 17:34:42 +00:00
Daniel Grana
560dd3fd8e useragent: if spider user_agent attribute is None, use default useragent instead of not including one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401095
2009-04-27 16:56:16 +00:00
Pablo Hoffman
78fa0bc2d8 increased log level of "Scraped" message to INFO
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401094
2009-04-27 15:20:46 +00:00
Pablo Hoffman
e72bc5918b added notice to pipelines.py project template about ITEM_PIPELINES settings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401093
2009-04-27 14:09:33 +00:00
Pablo Hoffman
1eac716edf added COOKIES_DEBUG to default settings and doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401092
2009-04-27 12:26:04 +00:00
Pablo Hoffman
b553bb3dff removed TODO comment
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401091
2009-04-27 12:20:24 +00:00
Daniel Grana
8eaa9219f6 test: fix webclient test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401090
2009-04-27 06:52:38 +00:00
Daniel Grana
e8ecc00cfb core: clean HTTP handling and add unittests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401089
2009-04-27 06:48:24 +00:00
Daniel Grana
0842fafd63 useragent: add unittests to mw and prepare to remove UA from downloader handlers
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401088
2009-04-27 06:47:26 +00:00
Daniel Grana
a86a89f41a cookies: mv cookies module under http module
--HG--
rename : scrapy/trunk/scrapy/utils/cookies.py => scrapy/trunk/scrapy/http/cookies.py
rename : scrapy/trunk/scrapy/tests/test_utils_cookies.py => scrapy/trunk/scrapy/tests/test_http_cookies.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401087
2009-04-27 06:46:46 +00:00
Daniel Grana
68f54a1a63 cookies: add test to check that wrapped cookiejar thread based lock is overridden
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401086
2009-04-27 06:45:49 +00:00
Daniel Grana
45f55c0df0 cookies: check mw outputs. refs #73
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401085
2009-04-23 13:27:04 +00:00
Daniel Grana
315a24add4 cookies: fix typo in cookiejar lock attribute name. refs #73
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401084
2009-04-22 23:05:15 +00:00
Pablo Hoffman
bec0b12270 changed settings doc to reflect recent cookies refactoring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401083
2009-04-22 20:03:56 +00:00
Daniel Grana
1b2c9e49d8 cookies: fix test import. refs #73
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401082
2009-04-22 17:52:43 +00:00
Daniel Grana
1ecf874a75 cookies: final new cookie middleware integration into core and enabled by default. refs #73
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401081
2009-04-22 17:21:46 +00:00
Daniel Grana
3de3c2a331 cookies: fix merging user cookies
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401080
2009-04-21 17:13:12 +00:00
Daniel Grana
4a6f7af739 cookies: now really add unittests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401079
2009-04-21 13:33:42 +00:00
Daniel Grana
1ce1481eab cookies: override cookielib cookiejar thread based lock by dummy lock
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401078
2009-04-21 13:32:19 +00:00
Daniel Grana
391dde86ee cookies: rename cookies dict as jars
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401077
2009-04-21 13:31:58 +00:00
Daniel Grana
48b436e81b cookies: add unittest and bugfix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401076
2009-04-21 13:31:26 +00:00
Daniel Grana
799301b865 http: bugfix appendlist method not setting headers if first time
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401075
2009-04-21 13:30:55 +00:00
Pablo Hoffman
3170dd9fdb redirect mw: remove body and content-type/content-length headers on 302 redirects, and added tests. also renamed some tests to more meaningful names
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401074
2009-04-21 01:38:11 +00:00
Daniel Grana
5861b345c7 cookies: add cookiejar to wrap cookielib and remove cookielib from mw
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401073
2009-04-20 23:59:24 +00:00
Daniel Grana
0f22dfecb8 cluster: bugfix in remove and schedule actions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401072
2009-04-20 12:22:05 +00:00
Pablo Hoffman
f6e83be60d docs/request-response.rst: added sphinx :param: tags to improve documentation structure and readability. also documented default behaviour for FormRequest changes committed in r1069
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401071
2009-04-20 02:31:22 +00:00
Pablo Hoffman
558a1a5033 FormRequest.from_response: fixed bug with setting method, override fields with those included in formdata, for those which already existed in the response <form>
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401070
2009-04-20 02:29:19 +00:00
Daniel Grana
be5106d730 core: strip not working close delay support from core to (working) extension
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401069
2009-04-17 20:43:50 +00:00
Daniel Grana
4d16571988 core: try to process requests after getting a DontCloseDomain to avoid delay on processing reinjected requests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401068
2009-04-17 19:13:38 +00:00
Pablo Hoffman
5f5962e6a4 improved SCHEDULER_ORDER setting doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401067
2009-04-17 13:06:31 +00:00
Daniel Grana
512b75942c cookies: add TODO comments to experimental cookie mw
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401066
2009-04-16 19:14:32 +00:00
Daniel Grana
60b330257a cookies: add new cookie middleware to experimental
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401065
2009-04-16 19:05:27 +00:00
Daniel Grana
d6c52d51ed cluster: promote new code as replacement of old pbcluster
--HG--
rename : scrapy/trunk/scrapy/contrib/pbcluster/crawler/__init__.py => scrapy/trunk/scrapy/contrib/cluster/crawler/__init__.py
rename : scrapy/trunk/scrapy/contrib_exp/cluster/crawler/manager.py => scrapy/trunk/scrapy/contrib/cluster/crawler/manager.py
rename : scrapy/trunk/scrapy/contrib_exp/cluster/hooks/__init__.py => scrapy/trunk/scrapy/contrib/cluster/hooks/__init__.py
rename : scrapy/trunk/scrapy/contrib_exp/cluster/hooks/svn.py => scrapy/trunk/scrapy/contrib/cluster/hooks/svn.py
rename : scrapy/trunk/scrapy/contrib/pbcluster/master/__init__.py => scrapy/trunk/scrapy/contrib/cluster/master/__init__.py
rename : scrapy/trunk/scrapy/contrib_exp/cluster/master/manager.py => scrapy/trunk/scrapy/contrib/cluster/master/manager.py
rename : scrapy/trunk/scrapy/contrib_exp/cluster/master/web.py => scrapy/trunk/scrapy/contrib/cluster/master/web.py
rename : scrapy/trunk/scrapy/contrib/pbcluster/master/ws_api.txt => scrapy/trunk/scrapy/contrib/cluster/master/ws_api.txt
rename : scrapy/trunk/scrapy/contrib/pbcluster/tools/scrapy-cluster-ctl.py => scrapy/trunk/scrapy/contrib/cluster/tools/scrapy-cluster-ctl.py
rename : scrapy/trunk/scrapy/contrib_exp/cluster/tools/test-worker.py => scrapy/trunk/scrapy/contrib/cluster/tools/test-worker.py
rename : scrapy/trunk/scrapy/contrib/pbcluster/worker/__init__.py => scrapy/trunk/scrapy/contrib/cluster/worker/__init__.py
rename : scrapy/trunk/scrapy/contrib_exp/cluster/worker/manager.py => scrapy/trunk/scrapy/contrib/cluster/worker/manager.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401064
2009-04-16 18:58:56 +00:00
Daniel Grana
821f6be3ce http: fix xmlrpc request cloning
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401063
2009-04-16 18:23:12 +00:00
Pablo Hoffman
bbd41bda33 more cleanup of default project settings templates and added a notice as suggested by Mark Ellul
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401062
2009-04-16 15:32:06 +00:00
Pablo Hoffman
3fce2496c7 modified page <title>
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401061
2009-04-15 01:31:04 +00:00
Daniel Grana
d18b78b8fe site: download section is linking to very old scrapy zip tarball
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401060
2009-04-14 11:20:52 +00:00
Daniel Grana
fae3d7a8dc core: adapt redirection and media scheduler priorities
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401059
2009-04-13 20:27:35 +00:00
Daniel Grana
b96a0c2756 core: change default priority to 0 to use balanced priorityqueue, and increase priority of redirected requests so memory doesnt hog because of redirection requests waiting for others requests to finish before they got a chance to be downloaded
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401058
2009-04-13 16:42:26 +00:00
Pablo Hoffman
29b3746def updated some homepage texts
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401057
2009-04-12 10:21:00 +00:00
Pablo Hoffman
743936062d removed unused code from Makefile
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401056
2009-04-12 10:20:46 +00:00
Pablo Hoffman
74f39480c6 added a couple of unittests for from_response errors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401055
2009-04-12 09:16:31 +00:00
Pablo Hoffman
54ad49f765 doc: fixed a couple of broken links
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401054
2009-04-12 09:05:00 +00:00
Pablo Hoffman
50efaab447 - added from_response() class method to FormRequest to support pre-populating
  HTML forms with fields taken from <form> elements contained in responses.
  implemented using the ClientForm library

- added ClientForm to Scrapy bundled libraries (scrapy.xlib)

- added unittests for new from_response() method

- documented new from_response() method, added a user login example to
  illustrate it, and a new faq entry

- improved overall quality of request/response doc

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401053
2009-04-12 08:31:55 +00:00
Pablo Hoffman
5378a35197 some more tuning to installation guide
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401052
2009-04-11 18:34:44 +00:00
Pablo Hoffman
b5bbf827b7 added tips about using Firefox addons to inspect the live browser DOM
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401051
2009-04-11 06:44:09 +00:00