mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 05:43:52 +00:00

1091 Commits

Author SHA1 Message Date
Pablo Hoffman
324ec07678 moved examples/contrib_exp to examples/experimental
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401041
2009-04-10 08:03:26 +00:00
Pablo Hoffman
317221d21c removed old .attribute() API from doc - it will be restored in the future when the adaptors code gets stable
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401040
2009-04-10 08:01:25 +00:00
Pablo Hoffman
46d9e681a9 minor updates to a couple of settings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401039
2009-04-10 07:33:46 +00:00
Pablo Hoffman
7c66f1739d Several documentation changes:
- merged (and updated) new tutorial from proposed doc
- stripped old tutorial and created new firebug topic
- added topic about useful third-party firefox add-ons
- rearranged main documentation index
- several assorted documentation fixes

--HG--
rename : scrapy/trunk/docs/proposed/tutorial.rst => scrapy/trunk/docs/intro/tutorial.rst
rename : scrapy/trunk/docs/intro/tutorial/scrot1.png => scrapy/trunk/docs/topics/_images/firebug1.png
rename : scrapy/trunk/docs/intro/tutorial/scrot2.png => scrapy/trunk/docs/topics/_images/firebug2.png
rename : scrapy/trunk/docs/intro/tutorial/scrot3.png => scrapy/trunk/docs/topics/_images/firebug3.png
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401038
2009-04-10 05:35:53 +00:00
Pablo Hoffman
0fb3ee16d8 more improvements to scrapy shell: added Request object, and support for modifying it and re-fetching it by issuing an empty 'get' command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401037
2009-04-03 04:13:21 +00:00
Pablo Hoffman
ef9f4961a1 scrapy.xpath: added docstring pointing to the doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401036
2009-04-03 04:05:57 +00:00
Pablo Hoffman
b7bd00b336 minor changes to some module descriptions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401035
2009-04-03 03:20:45 +00:00
Pablo Hoffman
7e7823654c some code refactoring for the scrapy shell command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401034
2009-04-03 03:13:22 +00:00
Pablo Hoffman
b521ca4d36 massive improvements to xpath selectors doc. refs #25
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401033
2009-04-03 01:33:52 +00:00
Pablo Hoffman
d732a79ead added documentation for scrapy shell. closes #78
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401032
2009-04-03 01:19:36 +00:00
Pablo Hoffman
931c29eabc updated Scrapy architecture doc by adding reference to Scheduler middlewares. refs #31
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401031
2009-04-03 00:43:09 +00:00
Pablo Hoffman
8757359775 doc: added note about logging from spiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401030
2009-04-02 22:43:52 +00:00
Daniel Grana
1683c4e971 tests: comment failed test until fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401029
2009-04-01 14:29:56 +00:00
artem
6ac767f84b tests: added url_query_parameter() fail case
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401028
2009-04-01 13:40:30 +00:00
Daniel Grana
128de5a49b cluster: add missing import
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401027
2009-03-31 04:00:58 +00:00
Daniel Grana
e15c928ed2 cluster: add log gzipping support
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401026
2009-03-31 03:50:13 +00:00
Daniel Grana
c2852b13ea imagespipeline: restore original log message for uptodate image
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401025
2009-03-30 10:41:55 +00:00
Daniel Grana
96ee661f2c pipeline: add missing future import for with statement at images pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401024
2009-03-27 06:59:38 +00:00
Daniel Grana
186d0e93b0 Revert "added rule shorthand, for creating CrawlSpider rules"
This reverts commit r863.

This is not the place for this shorthand, and it doesn't have the blessing of
the BDFL. I look forward to the return of this shorthand.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401023
2009-03-27 06:46:06 +00:00
Daniel Grana
7227b65de1 http: fix copy and failing appendlist method of Headers, also add missing tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401022
2009-03-27 06:42:24 +00:00
Daniel Grana
1d0e2d1202 linkextractors: add arg_to_iter support to RegexLinkExtractor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401021
2009-03-27 06:05:54 +00:00
Daniel Grana
4fa7e6cac4 tests: fix images tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401020
2009-03-27 05:42:47 +00:00
Daniel Grana
5b7cfb3184 pipeline: refactor image pipelines moving common functionality to BaseImagesPipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401019
2009-03-27 04:41:11 +00:00
Daniel Grana
70a686c02b http: add errback to Request constructor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401018
2009-03-25 13:15:55 +00:00
Daniel Grana
828c3e0988 core: remove deferred_degenerator, instead use a cooperative map
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401017
2009-03-25 13:15:24 +00:00
Daniel Grana
9d59bb3153 utils: remove unused function arg_to_list
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401016
2009-03-25 13:14:50 +00:00
Daniel Grana
de0d57cd9b log: add a twisted.python.log.err handler wrapper just like the log.msg one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401015
2009-03-25 13:14:26 +00:00
Pablo Hoffman
7c4c17088a added copy tests for Headers which need to pass but are currently failing
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401014
2009-03-24 21:21:39 +00:00
Pablo Hoffman
d8c0440700 fixed bug in previous commit
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401013
2009-03-24 21:03:38 +00:00
Pablo Hoffman
0c9631594d - made Request.copy() use Request.replace()
- added callback, dont_filter, encoding to Request copy/replace
- fixed Request.replace() and Response.replace() which weren't working properly with empty arguments
- added new test cases
- added doc about copying requests and deferreds

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401012
2009-03-24 20:02:42 +00:00
Pablo Hoffman
03fc6fc477 made scrapy installation guide a bit more friendly for windows users
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401011
2009-03-22 22:05:23 +00:00
Pablo Hoffman
30a3df0c9d added documentation about --set and small improvement
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401010
2009-03-22 19:25:08 +00:00
Pablo Hoffman
cbe8f8ff45 moved spider loading below command line option processing, so options overridden by --set argument are taken into account at spider module-level code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401009
2009-03-22 19:19:31 +00:00
Pablo Hoffman
28f947d2b9 added support for overriding settings from command line arguments, as suggested by Mark Ellul
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401008
2009-03-22 19:18:27 +00:00
Pablo Hoffman
482ec0c28c fixed bug in passing request callback arguments example code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401007
2009-03-22 16:27:35 +00:00
Pablo Hoffman
f4bd8a48dc added doc about passing arguments to request callbacks, as suggested by tarasm
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401006
2009-03-22 16:24:56 +00:00
Pablo Hoffman
c476d586ea fixed typo in signals doc. closes #75
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401005
2009-03-22 16:22:57 +00:00
Daniel Grana
0b1184e472 redirectmw: allow dont_filter requests to continue redirecting but limit max redirections by global setting and per request
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401004
2009-03-20 19:38:32 +00:00
Daniel Grana
87e63216ef retrymw: remove fingerprint cache and use request meta instead, add more tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401003
2009-03-20 19:37:55 +00:00
Andres Moreira
cab6387c87 profiling: add new priorityqueue implementation (PQ5c) with cache support. fixed bug in run.py with the use of the with statement.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401002
2009-03-19 13:11:27 +00:00
Daniel Grana
bd1be0ac67 profiling: fix priority distribution and add new ones
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401001
2009-03-19 12:04:19 +00:00
Daniel Grana
2e63ef60f1 profiling: remove useless import
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%401000
2009-03-19 05:42:23 +00:00
Daniel Grana
fdfa5302a4 profiling: minor fixes and new testcase for IndexError exception
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40999
2009-03-19 05:18:35 +00:00
Daniel Grana
9d1855a66e profiling: add alternative priorityqueue implementations and disable not useful ones from default runs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40998
2009-03-19 04:09:24 +00:00
Daniel Grana
1c659ce615 profiling: add list+deque+cache+islice just to compare with pure slice
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40997
2009-03-19 03:33:06 +00:00
Daniel Grana
685807e3c4 profiling: improve list+deque+cache performance
results:

 == With 1 priorities (/tmp/pq-13882-1-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 12.6956150532
 dict+deque implementation: 5.3080239296
 deque+heapq implementation: 3.11057305336
 deque+defaultdict+deque implementation: 3.06583619118
 list+deque implementation: 4.85028195381
 list+deque+cache implementation: 4.86092495918

 == With 3 priorities (/tmp/pq-13882-3-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 13.9048631191
 dict+deque implementation: 6.526501894
 deque+heapq implementation: 9.95749187469
 deque+defaultdict+deque implementation: 4.94318699837
 list+deque implementation: 5.48832702637
 list+deque+cache implementation: 4.77395009995

 == With 5 priorities (/tmp/pq-13882-5-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.1862449646
 dict+deque implementation: 7.45535206795
 deque+heapq implementation: 11.7175529003
 deque+defaultdict+deque implementation: 5.40972518921
 list+deque implementation: 5.87488412857
 list+deque+cache implementation: 4.73579287529

 == With 10 priorities (/tmp/pq-13882-10-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.2052979469
 dict+deque implementation: 9.94834208488
 deque+heapq implementation: 13.0460109711
 deque+defaultdict+deque implementation: 5.79300785065
 list+deque implementation: 6.9981739521
 list+deque+cache implementation: 4.81988596916

 == With 100 priorities (/tmp/pq-13882-100-50000) ==

 pushpops = 50000, times = 30
 heapq implementation: 14.9574189186
 dict+deque implementation: 55.6348400116
 deque+heapq implementation: 14.9515259266
 deque+defaultdict+deque implementation: 10.6776599884
 list+deque implementation: 25.9212520123
 list+deque+cache implementation: 4.77596998215

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40996
2009-03-19 03:11:27 +00:00
Daniel Grana
44ba741277 profiling: automatically include new PriorityQueue classes in testcases
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40995
2009-03-19 02:47:59 +00:00
Daniel Grana
e1c1865c2a profiling: add options to priority queue profiler
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40994
2009-03-19 02:25:20 +00:00
Daniel Grana
22c51638a8 profiling: add priorityqueue alternatives and profiling / test cases
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40993
2009-03-18 19:44:36 +00:00
Andres Moreira
2951e8425f Improved PriorityQueue and PriorityStack to boost performance again. Also updated the unit tests for those datatypes.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40992
2009-03-18 17:09:26 +00:00