Pablo Hoffman
7e7823654c
some code refactoring for the scrapy shell command
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401034
2009-04-03 03:13:22 +00:00
Pablo Hoffman
b521ca4d36
massive improvements to xpath selectors doc. refs #25
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401033
2009-04-03 01:33:52 +00:00
Pablo Hoffman
d732a79ead
added documentation for scrapy shell. closes #78
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401032
2009-04-03 01:19:36 +00:00
Pablo Hoffman
931c29eabc
updated Scrapy architecture doc by adding reference to Scheduler middlewares. refs #31
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401031
2009-04-03 00:43:09 +00:00
Pablo Hoffman
8757359775
doc: added note about logging from spiders
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401030
2009-04-02 22:43:52 +00:00
Daniel Grana
1683c4e971
tests: comment out failing test until it is fixed
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401029
2009-04-01 14:29:56 +00:00
artem
6ac767f84b
tests: added url_query_parameter() fail case
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401028
2009-04-01 13:40:30 +00:00
Daniel Grana
128de5a49b
cluster: add missing import
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401027
2009-03-31 04:00:58 +00:00
Daniel Grana
e15c928ed2
cluster: add log gzipping support
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401026
2009-03-31 03:50:13 +00:00
Daniel Grana
c2852b13ea
imagespipeline: restore original log message for uptodate image
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401025
2009-03-30 10:41:55 +00:00
Daniel Grana
96ee661f2c
pipeline: add missing future import for with statement at images pipeline
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401024
2009-03-27 06:59:38 +00:00
Daniel Grana
186d0e93b0
Revert "added rule shorthand, for creating CrawlSpider rules"
...
This reverts commit r863.
This is not the place for this shorthand, and it doesn't have the blessing of
the BDFL. I look forward to the return of this shorthand.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401023
2009-03-27 06:46:06 +00:00
Daniel Grana
7227b65de1
http: fix copy and failing appendlist method of Headers, also add missing tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401022
2009-03-27 06:42:24 +00:00
Daniel Grana
1d0e2d1202
linkextractors: add arg_to_iter support to RegexLinkExtractor
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401021
2009-03-27 06:05:54 +00:00
Daniel Grana
4fa7e6cac4
tests: fix images tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401020
2009-03-27 05:42:47 +00:00
Daniel Grana
5b7cfb3184
pipeline: refactor image pipelines moving common functionality to BaseImagesPipeline
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401019
2009-03-27 04:41:11 +00:00
Daniel Grana
70a686c02b
http: add errback to Request constructor
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401018
2009-03-25 13:15:55 +00:00
Daniel Grana
828c3e0988
core: remove deferred_degenerator, instead use a cooperative map
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401017
2009-03-25 13:15:24 +00:00
Daniel Grana
9d59bb3153
utils: remove unused function arg_to_list
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401016
2009-03-25 13:14:50 +00:00
Daniel Grana
de0d57cd9b
log: add a twisted.python.log.err handler wrapper, just like the log.msg one
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401015
2009-03-25 13:14:26 +00:00
Pablo Hoffman
7c4c17088a
added copy tests for Headers which need to pass but are currently failing
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401014
2009-03-24 21:21:39 +00:00
Pablo Hoffman
d8c0440700
fixed bug in previous commit
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401013
2009-03-24 21:03:38 +00:00
Pablo Hoffman
0c9631594d
- made Request.copy() use Request.replace()
...
- added callback, dont_filter, encoding to Request copy/replace
- fixed Request.replace() and Response.replace() which weren't working properly with empty arguments
- added new test cases
- added doc about copying requests and deferreds
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401012
2009-03-24 20:02:42 +00:00
Pablo Hoffman
03fc6fc477
made scrapy installation guide a bit more friendly for windows users
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401011
2009-03-22 22:05:23 +00:00
Pablo Hoffman
30a3df0c9d
added documentation about --set and small improvement
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401010
2009-03-22 19:25:08 +00:00
Pablo Hoffman
cbe8f8ff45
moved spider loading below command line option processing, so options overridden by the --set argument are taken into account in spider module-level code
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401009
2009-03-22 19:19:31 +00:00
Pablo Hoffman
28f947d2b9
added support for overriding settings from command line arguments, as suggested by Mark Ellul
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401008
2009-03-22 19:18:27 +00:00
Pablo Hoffman
482ec0c28c
fixed bug in passing request callback arguments example code
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401007
2009-03-22 16:27:35 +00:00
Pablo Hoffman
f4bd8a48dc
added doc about passing arguments to request callbacks, as suggested by tarasm
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401006
2009-03-22 16:24:56 +00:00
Pablo Hoffman
c476d586ea
fixed typo in signals doc. closes #75
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401005
2009-03-22 16:22:57 +00:00
Daniel Grana
0b1184e472
redirectmw: allow dont_filter requests to continue redirecting but limit max redirections by global setting and per request
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401004
2009-03-20 19:38:32 +00:00
Daniel Grana
87e63216ef
retrymw: remove fingerprint cache and use request meta instead, add more tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401003
2009-03-20 19:37:55 +00:00
Andres Moreira
cab6387c87
profiling: add new priorityqueue implementation (PQ5c) with cache support. fixed bug in run.py with the use of the with statement.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401002
2009-03-19 13:11:27 +00:00
Daniel Grana
bd1be0ac67
profiling: fix priority distribution and add new ones
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401001
2009-03-19 12:04:19 +00:00
Daniel Grana
2e63ef60f1
profiling: remove useless import
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401000
2009-03-19 05:42:23 +00:00
Daniel Grana
fdfa5302a4
profiling: minor fixes and new testcase for IndexError exception
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40999
2009-03-19 05:18:35 +00:00
Daniel Grana
9d1855a66e
profiling: add alternative priorityqueue implementations and disable the ones that are not useful from default runs
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40998
2009-03-19 04:09:24 +00:00
Daniel Grana
1c659ce615
profiling: add list+deque+cache+islice just to compare with pure slice
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40997
2009-03-19 03:33:06 +00:00
Daniel Grana
685807e3c4
profiling: improve list+deque+cache performance
...
results:
== With 1 priorities (/tmp/pq-13882-1-50000) ==
pushpops = 50000, times = 30
heapq implementation: 12.6956150532
dict+deque implementation: 5.3080239296
deque+heapq implementation: 3.11057305336
deque+defaultdict+deque implementation: 3.06583619118
list+deque implementation: 4.85028195381
list+deque+cache implementation: 4.86092495918
== With 3 priorities (/tmp/pq-13882-3-50000) ==
pushpops = 50000, times = 30
heapq implementation: 13.9048631191
dict+deque implementation: 6.526501894
deque+heapq implementation: 9.95749187469
deque+defaultdict+deque implementation: 4.94318699837
list+deque implementation: 5.48832702637
list+deque+cache implementation: 4.77395009995
== With 5 priorities (/tmp/pq-13882-5-50000) ==
pushpops = 50000, times = 30
heapq implementation: 14.1862449646
dict+deque implementation: 7.45535206795
deque+heapq implementation: 11.7175529003
deque+defaultdict+deque implementation: 5.40972518921
list+deque implementation: 5.87488412857
list+deque+cache implementation: 4.73579287529
== With 10 priorities (/tmp/pq-13882-10-50000) ==
pushpops = 50000, times = 30
heapq implementation: 14.2052979469
dict+deque implementation: 9.94834208488
deque+heapq implementation: 13.0460109711
deque+defaultdict+deque implementation: 5.79300785065
list+deque implementation: 6.9981739521
list+deque+cache implementation: 4.81988596916
== With 100 priorities (/tmp/pq-13882-100-50000) ==
pushpops = 50000, times = 30
heapq implementation: 14.9574189186
dict+deque implementation: 55.6348400116
deque+heapq implementation: 14.9515259266
deque+defaultdict+deque implementation: 10.6776599884
list+deque implementation: 25.9212520123
list+deque+cache implementation: 4.77596998215
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40996
2009-03-19 03:11:27 +00:00
Daniel Grana
44ba741277
profiling: automatically include new PriorityQueue classes in testcases
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40995
2009-03-19 02:47:59 +00:00
Daniel Grana
e1c1865c2a
profiling: add options to priority queue profiler
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40994
2009-03-19 02:25:20 +00:00
Daniel Grana
22c51638a8
profiling: add priorityqueue alternatives and profiling / test cases
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40993
2009-03-18 19:44:36 +00:00
Andres Moreira
2951e8425f
Improved PriorityQueue and PriorityStack to boost performance again. Also updated the unit tests for those datatypes.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40992
2009-03-18 17:09:26 +00:00
Pablo Hoffman
7b664a57b2
grammar adjustment
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40991
2009-03-18 00:47:07 +00:00
Daniel Grana
fb3a2dab2f
core: allow returning None from spiders to support pipeline/download throttling
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40990
2009-03-17 21:30:33 +00:00
Ismael Carnales
c54a3fe7f6
removed item_to_dict
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40989
2009-03-17 15:22:57 +00:00
Ismael Carnales
63150e2ed4
added item_to_dict util function
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40988
2009-03-16 01:12:57 +00:00
Daniel Grana
f9e5915fd4
newitem: change itemadaptor tests to avoid false positives
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40987
2009-03-13 20:05:37 +00:00
Andres Moreira
cbf6d76c5d
Improved PriorityQueue and PriorityStack with a patch sent by Federico Feroldi. The new implementation is about twice as fast as the old one.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40986
2009-03-12 19:06:57 +00:00
Ismael Carnales
518a7c87dd
make field_adaptors non-public
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40985
2009-03-11 13:26:29 +00:00