mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 07:44:38 +00:00

1023 Commits

Author SHA1 Message Date
Daniel Grana
186d0e93b0 Revert "added rule shorthand, for creating CrawlSpider rules"
This reverts commit r863.

This is not the place for this shorthand, and it doesn't have the blessing of
the BDFL. I look forward to the return of this shorthand.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401023
2009-03-27 06:46:06 +00:00
Daniel Grana
7227b65de1 http: fix copy and failing appendlist method of Headers, also add missing tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401022
2009-03-27 06:42:24 +00:00
Daniel Grana
1d0e2d1202 linkextractors: add arg_to_iter support to RegexLinkExtractor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401021
2009-03-27 06:05:54 +00:00
Daniel Grana
4fa7e6cac4 tests: fix images tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401020
2009-03-27 05:42:47 +00:00
Daniel Grana
5b7cfb3184 pipeline: refactor image pipelines moving common functionality to BaseImagesPipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401019
2009-03-27 04:41:11 +00:00
Daniel Grana
70a686c02b http: add errback to Request constructor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401018
2009-03-25 13:15:55 +00:00
Daniel Grana
828c3e0988 core: remove deferred_degenerator, instead use a cooperative map
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401017
2009-03-25 13:15:24 +00:00
Daniel Grana
9d59bb3153 utils: remove unused function arg_to_list
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401016
2009-03-25 13:14:50 +00:00
Daniel Grana
de0d57cd9b log: add a twisted.python.log.err handler wrapper, just like the log.msg one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401015
2009-03-25 13:14:26 +00:00
Pablo Hoffman
7c4c17088a added copy tests for Headers which need to pass but are currently failing
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401014
2009-03-24 21:21:39 +00:00
Pablo Hoffman
d8c0440700 fixed bug in previous commit
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401013
2009-03-24 21:03:38 +00:00
Pablo Hoffman
0c9631594d - made Request.copy() use Request.replace()
- added callback, dont_filter, encoding to Request copy/replace
- fixed Request.replace() and Response.replace() which weren't working properly with empty arguments
- added new test cases
- added doc about copying requests and deferreds

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401012
2009-03-24 20:02:42 +00:00
Pablo Hoffman
03fc6fc477 made scrapy installation guide a bit more friendly for windows users
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401011
2009-03-22 22:05:23 +00:00
Pablo Hoffman
30a3df0c9d added documentation about --set and small improvement
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401010
2009-03-22 19:25:08 +00:00
Pablo Hoffman
cbe8f8ff45 moved spider loading below command line option processing, so options overridden by the --set argument are taken into account in spider module-level code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401009
2009-03-22 19:19:31 +00:00
Pablo Hoffman
28f947d2b9 added support for overriding settings from command line arguments, as suggested by Mark Ellul
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401008
2009-03-22 19:18:27 +00:00
Pablo Hoffman
482ec0c28c fixed bug in passing request callback arguments example code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401007
2009-03-22 16:27:35 +00:00
Pablo Hoffman
f4bd8a48dc added doc about passing arguments to request callbacks, as suggested by tarasm
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401006
2009-03-22 16:24:56 +00:00
Pablo Hoffman
c476d586ea fixed typo in signals doc. closes #75
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401005
2009-03-22 16:22:57 +00:00
Daniel Grana
0b1184e472 redirectmw: allow dont_filter requests to continue redirecting but limit max redirections by global setting and per request
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401004
2009-03-20 19:38:32 +00:00
Daniel Grana
87e63216ef retrymw: remove fingerprint cache and use request meta instead, add more tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401003
2009-03-20 19:37:55 +00:00
Andres Moreira
cab6387c87 profiling: add new priorityqueue implementation (PQ5c) with cache support. fixed bug in run.py with the use of the with statement.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401002
2009-03-19 13:11:27 +00:00
Daniel Grana
bd1be0ac67 profiling: fix priority distribution and add new ones
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401001
2009-03-19 12:04:19 +00:00
Daniel Grana
2e63ef60f1 profiling: remove useless import
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401000
2009-03-19 05:42:23 +00:00
Daniel Grana
fdfa5302a4 profiling: minor fixes and new testcase for IndexError exception
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40999
2009-03-19 05:18:35 +00:00
Daniel Grana
9d1855a66e profiling: add alternative priorityqueue implementations and disable not useful ones from default runs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40998
2009-03-19 04:09:24 +00:00
Daniel Grana
1c659ce615 profiling: add list+deque+cache+islice just to compare with pure slice
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40997
2009-03-19 03:33:06 +00:00
Daniel Grana
685807e3c4 profiling: improve list+deque+cache performance
results:

 == With 1 priorities (/tmp/pq-13882-1-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 12.6956150532
 dict+deque implementation: 5.3080239296
 deque+heapq implementation: 3.11057305336
 deque+defaultdict+deque implementation: 3.06583619118
 list+deque implementation: 4.85028195381
 list+deque+cache implementation: 4.86092495918

 == With 3 priorities (/tmp/pq-13882-3-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 13.9048631191
 dict+deque implementation: 6.526501894
 deque+heapq implementation: 9.95749187469
 deque+defaultdict+deque implementation: 4.94318699837
 list+deque implementation: 5.48832702637
 list+deque+cache implementation: 4.77395009995

 == With 5 priorities (/tmp/pq-13882-5-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.1862449646
 dict+deque implementation: 7.45535206795
 deque+heapq implementation: 11.7175529003
 deque+defaultdict+deque implementation: 5.40972518921
 list+deque implementation: 5.87488412857
 list+deque+cache implementation: 4.73579287529

 == With 10 priorities (/tmp/pq-13882-10-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.2052979469
 dict+deque implementation: 9.94834208488
 deque+heapq implementation: 13.0460109711
 deque+defaultdict+deque implementation: 5.79300785065
 list+deque implementation: 6.9981739521
 list+deque+cache implementation: 4.81988596916

 == With 100 priorities (/tmp/pq-13882-100-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.9574189186
 dict+deque implementation: 55.6348400116
 deque+heapq implementation: 14.9515259266
 deque+defaultdict+deque implementation: 10.6776599884
 list+deque implementation: 25.9212520123
 list+deque+cache implementation: 4.77596998215

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40996
2009-03-19 03:11:27 +00:00
Daniel Grana
44ba741277 profiling: automatically include new PriorityQueue classes in testcases
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40995
2009-03-19 02:47:59 +00:00
Daniel Grana
e1c1865c2a profiling: add options to priority queue profiler
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40994
2009-03-19 02:25:20 +00:00
Daniel Grana
22c51638a8 profiling: add priorityqueue alternatives and profiling / test cases
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40993
2009-03-18 19:44:36 +00:00
Andres Moreira
2951e8425f Improved PriorityQueue and PriorityStack to boost performance again. Also updated the unit tests for those datatypes.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40992
2009-03-18 17:09:26 +00:00
Pablo Hoffman
7b664a57b2 grammar adjustment
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40991
2009-03-18 00:47:07 +00:00
Daniel Grana
fb3a2dab2f core: allow returning None from spiders to support pipeline/download throttling
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40990
2009-03-17 21:30:33 +00:00
Ismael Carnales
c54a3fe7f6 removed item_to_dict
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40989
2009-03-17 15:22:57 +00:00
Ismael Carnales
63150e2ed4 added item_to_dict util function
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40988
2009-03-16 01:12:57 +00:00
Daniel Grana
f9e5915fd4 newitem: change itemadaptor tests to avoid false positives
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40987
2009-03-13 20:05:37 +00:00
Andres Moreira
cbf6d76c5d Improved PriorityQueue and PriorityStack with a patch sent by Federico Feroldi. The new implementation is about two times faster than the old.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40986
2009-03-12 19:06:57 +00:00
Ismael Carnales
518a7c87dd make field_adaptors non-public
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40985
2009-03-11 13:26:29 +00:00
Ismael Carnales
2ba017b14d add tests for adaptation of multivalued fields
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40984
2009-03-11 13:01:26 +00:00
Daniel Grana
0be771152b stats: fix stats key typo
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40983
2009-03-11 03:01:30 +00:00
Pablo Hoffman
f8ce241484 fixed MailSender documentation example - closes #74
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40982
2009-03-10 19:56:22 +00:00
Ismael Carnales
ec1a6af04a added MultiValuedField to newitem
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40981
2009-03-10 19:35:45 +00:00
Ismael Carnales
2f75c2c93e removed unneeded import
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40980
2009-03-10 19:17:53 +00:00
Ismael Carnales
04baa7aed7 removed declarative from newitem (now using 'plain' metaclasses) added basic test for newitem
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40979
2009-03-10 18:40:26 +00:00
Ismael Carnales
774e7f623a fixed default adaptor inheritance, added test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40978
2009-03-10 17:59:53 +00:00
Daniel Grana
67ab30a5ba core: add application/xhtml+xml support to responsetypes
more info at http://www.w3.org/TR/xhtml-media-types/#media-types

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40977
2009-03-10 17:51:03 +00:00
Ismael Carnales
63e6ce6e47 made IDENTITY a function of adaptor module
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40976
2009-03-10 16:57:27 +00:00
Ismael Carnales
698e6e73ea moved default_adaptor logic fully to the metaclass in ItemAdaptor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40975
2009-03-10 16:34:48 +00:00
Ismael Carnales
06e049e551 move meta class initialization to metaclass in ItemAdaptor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40974
2009-03-10 16:15:49 +00:00