mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 21:43:53 +00:00

1134 Commits

Author SHA1 Message Date
Pablo Hoffman
7e7823654c some code refactoring for the scrapy shell command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401034
2009-04-03 03:13:22 +00:00
Pablo Hoffman
b521ca4d36 massive improvements to xpath selectors doc. refs #25
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401033
2009-04-03 01:33:52 +00:00
Pablo Hoffman
d732a79ead added documentation for scrapy shell. closes #78
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401032
2009-04-03 01:19:36 +00:00
Pablo Hoffman
931c29eabc updated Scrapy architecture doc by adding reference to Scheduler middlewares. refs #31
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401031
2009-04-03 00:43:09 +00:00
Pablo Hoffman
8757359775 doc: added note about logging from spiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401030
2009-04-02 22:43:52 +00:00
Daniel Grana
1683c4e971 tests: comment failed test until fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401029
2009-04-01 14:29:56 +00:00
artem
6ac767f84b tests: added url_query_parameter() fail case
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401028
2009-04-01 13:40:30 +00:00
Daniel Grana
128de5a49b cluster: add missing import
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401027
2009-03-31 04:00:58 +00:00
Daniel Grana
e15c928ed2 cluster: add log gzipping support
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401026
2009-03-31 03:50:13 +00:00
Daniel Grana
c2852b13ea imagespipeline: restore original log message for uptodate image
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401025
2009-03-30 10:41:55 +00:00
Daniel Grana
96ee661f2c pipeline: add missing future import for with statement at images pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401024
2009-03-27 06:59:38 +00:00
Daniel Grana
186d0e93b0 Revert "added rule shorthand, for creating CrawlSpider rules"
This reverts commit r863.

This is not the place for this shorthand, and it doesn't have the blessing of
the BDFL. I look forward to the return of this shorthand.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401023
2009-03-27 06:46:06 +00:00
Daniel Grana
7227b65de1 http: fix copy and failing appendlist method of Headers, also add missing tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401022
2009-03-27 06:42:24 +00:00
Daniel Grana
1d0e2d1202 linkextractors: add arg_to_iter support to RegexLinkExtractor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401021
2009-03-27 06:05:54 +00:00
Daniel Grana
4fa7e6cac4 tests: fix images tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401020
2009-03-27 05:42:47 +00:00
Daniel Grana
5b7cfb3184 pipeline: refactor image pipelines moving common functionality to BaseImagesPipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401019
2009-03-27 04:41:11 +00:00
Daniel Grana
70a686c02b http: add errback to Request constructor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401018
2009-03-25 13:15:55 +00:00
Daniel Grana
828c3e0988 core: remove deferred_degenerator, instead use a cooperative map
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401017
2009-03-25 13:15:24 +00:00
Daniel Grana
9d59bb3153 utils: remove unused function arg_to_list
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401016
2009-03-25 13:14:50 +00:00
Daniel Grana
de0d57cd9b log: add a twisted.python.log.err handler wrapper, just like the log.msg one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401015
2009-03-25 13:14:26 +00:00
Pablo Hoffman
7c4c17088a added copy tests for Headers which need to pass but are currently failing
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401014
2009-03-24 21:21:39 +00:00
Pablo Hoffman
d8c0440700 fixed bug in previous commit
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401013
2009-03-24 21:03:38 +00:00
Pablo Hoffman
0c9631594d - made Request.copy() use Request.replace()
- added callback, dont_filter, encoding to Request copy/replace
- fixed Request.replace() and Response.replace() which weren't working properly with empty arguments
- added new test cases
- added doc about copying requests and deferreds

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401012
2009-03-24 20:02:42 +00:00
Pablo Hoffman
03fc6fc477 made scrapy installation guide a bit more friendly for windows users
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401011
2009-03-22 22:05:23 +00:00
Pablo Hoffman
30a3df0c9d added documentation about --set and small improvement
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401010
2009-03-22 19:25:08 +00:00
Pablo Hoffman
cbe8f8ff45 moved spider loading below command line option processing, so options overridden by the --set argument are taken into account in spider module-level code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401009
2009-03-22 19:19:31 +00:00
Pablo Hoffman
28f947d2b9 added support for overriding settings from command line arguments, as suggested by Mark Ellul
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401008
2009-03-22 19:18:27 +00:00
Pablo Hoffman
482ec0c28c fixed bug in passing request callback arguments example code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401007
2009-03-22 16:27:35 +00:00
Pablo Hoffman
f4bd8a48dc added doc about passing arguments to request callbacks, as suggested by tarasm
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401006
2009-03-22 16:24:56 +00:00
Pablo Hoffman
c476d586ea fixed typo in signals doc. closes #75
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401005
2009-03-22 16:22:57 +00:00
Daniel Grana
0b1184e472 redirectmw: allow dont_filter requests to continue redirecting but limit max redirections by global setting and per request
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401004
2009-03-20 19:38:32 +00:00
Daniel Grana
87e63216ef retrymw: remove fingerprint cache and use request meta instead, add more tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401003
2009-03-20 19:37:55 +00:00
Andres Moreira
cab6387c87 profiling: add new priorityqueue implementation (PQ5c) with cache support. fixed bug in run.py with the use of the with statement.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401002
2009-03-19 13:11:27 +00:00
Daniel Grana
bd1be0ac67 profiling: fix priority distribution and add new ones
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401001
2009-03-19 12:04:19 +00:00
Daniel Grana
2e63ef60f1 profiling: remove useless import
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401000
2009-03-19 05:42:23 +00:00
Daniel Grana
fdfa5302a4 profiling: minor fixes and new testcase for IndexError exception
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40999
2009-03-19 05:18:35 +00:00
Daniel Grana
9d1855a66e profiling: add alternative priorityqueue implementations and disable the ones that are not useful from default runs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40998
2009-03-19 04:09:24 +00:00
Daniel Grana
1c659ce615 profiling: add list+deque+cache+islice just to compare with pure slice
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40997
2009-03-19 03:33:06 +00:00
Daniel Grana
685807e3c4 profiling: improve list+deque+cache performance
results:

 == With 1 priorities (/tmp/pq-13882-1-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 12.6956150532
 dict+deque implementation: 5.3080239296
 deque+heapq implementation: 3.11057305336
 deque+defaultdict+deque implementation: 3.06583619118
 list+deque implementation: 4.85028195381
 list+deque+cache implementation: 4.86092495918

 == With 3 priorities (/tmp/pq-13882-3-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 13.9048631191
 dict+deque implementation: 6.526501894
 deque+heapq implementation: 9.95749187469
 deque+defaultdict+deque implementation: 4.94318699837
 list+deque implementation: 5.48832702637
 list+deque+cache implementation: 4.77395009995

 == With 5 priorities (/tmp/pq-13882-5-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.1862449646
 dict+deque implementation: 7.45535206795
 deque+heapq implementation: 11.7175529003
 deque+defaultdict+deque implementation: 5.40972518921
 list+deque implementation: 5.87488412857
 list+deque+cache implementation: 4.73579287529

 == With 10 priorities (/tmp/pq-13882-10-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.2052979469
 dict+deque implementation: 9.94834208488
 deque+heapq implementation: 13.0460109711
 deque+defaultdict+deque implementation: 5.79300785065
 list+deque implementation: 6.9981739521
 list+deque+cache implementation: 4.81988596916

 == With 100 priorities (/tmp/pq-13882-100-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.9574189186
 dict+deque implementation: 55.6348400116
 deque+heapq implementation: 14.9515259266
 deque+defaultdict+deque implementation: 10.6776599884
 list+deque implementation: 25.9212520123
 list+deque+cache implementation: 4.77596998215

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40996
2009-03-19 03:11:27 +00:00
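The benchmark in commit 685807e3c4 compares a heapq-based queue against several deque-based ones. The code those commits actually added is not shown in this log, so the following is only a minimal sketch of the general idea, assuming that lower numbers mean higher priority and that items of equal priority come out in FIFO order. The class names `HeapPriorityQueue` and `BucketPriorityQueue` are hypothetical and are not taken from the repository.

```python
import heapq
from collections import defaultdict, deque
from itertools import count


class HeapPriorityQueue:
    """Baseline sketch: a binary heap of (priority, sequence, item) tuples."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a priority

    def push(self, item, priority=0):
        heapq.heappush(self._heap, (priority, next(self._seq), item))

    def pop(self):
        # heappop raises IndexError on an empty heap
        priority, _, item = heapq.heappop(self._heap)
        return item, priority

    def __len__(self):
        return len(self._heap)


class BucketPriorityQueue:
    """Sketch: one FIFO deque per priority, caching the lowest non-empty priority."""

    def __init__(self):
        self._buckets = defaultdict(deque)
        self._lowest = None  # cached lowest priority that has items
        self._size = 0

    def push(self, item, priority=0):
        self._buckets[priority].append(item)
        self._size += 1
        if self._lowest is None or priority < self._lowest:
            self._lowest = priority

    def pop(self):
        if self._size == 0:
            raise IndexError("pop from an empty queue")
        priority = self._lowest
        bucket = self._buckets[priority]
        item = bucket.popleft()
        self._size -= 1
        if not bucket:
            # Drop the empty bucket and recompute the cache only when needed
            del self._buckets[priority]
            self._lowest = min(self._buckets) if self._buckets else None
        return item, priority

    def __len__(self):
        return self._size
```

In this sketch the bucketed queue pushes and pops in constant time while a bucket stays non-empty, and only scans the set of priorities when a bucket drains. That is one plausible reason a cached variant could stay flat as the number of priorities grows, but the sketch makes no claim to reproduce the timings reported above.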
Daniel Grana
44ba741277 profiling: automatically include new PriorityQueue classes in test cases
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40995
2009-03-19 02:47:59 +00:00
Daniel Grana
e1c1865c2a profiling: add options to priority queue profiler
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40994
2009-03-19 02:25:20 +00:00
Daniel Grana
22c51638a8 profiling: add priorityqueue alternatives and profiling / test cases
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40993
2009-03-18 19:44:36 +00:00
Andres Moreira
2951e8425f Improved PriorityQueue and PriorityStack that boost the performance again. Also updated the unittest for those datatypes.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40992
2009-03-18 17:09:26 +00:00
Pablo Hoffman
7b664a57b2 grammar adjustment
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40991
2009-03-18 00:47:07 +00:00
Daniel Grana
fb3a2dab2f core: allow returning None from spiders to support pipeline/download throttling
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40990
2009-03-17 21:30:33 +00:00
Ismael Carnales
c54a3fe7f6 removed item_to_dict
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40989
2009-03-17 15:22:57 +00:00
Ismael Carnales
63150e2ed4 added item_to_dict util function
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40988
2009-03-16 01:12:57 +00:00
Daniel Grana
f9e5915fd4 newitem: change itemadaptor tests to avoid false positives
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40987
2009-03-13 20:05:37 +00:00
Andres Moreira
cbf6d76c5d Improved PriorityQueue and PriorityStack with a patch sent by Federico Feroldi. The new implementation is about twice as fast as the old one.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40986
2009-03-12 19:06:57 +00:00
Ismael Carnales
518a7c87dd make field_adaptors non-public
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40985
2009-03-11 13:26:29 +00:00