samus_
bd4c106a61
* adjusted RETRY_TIMES setting and behaviour
...
* refactored scrapy.contrib.downloadermiddleware.retry.RetryMiddleware
* created test for it (scrapy.tests.test_middleware_retry.RetryTest)
* updated docstring at scrapy.core.downloader.middleware.DownloaderMiddlewareManager
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40488
2008-12-12 21:26:00 +00:00
Ismael Carnales
f7fce6cdb9
changed formatting of scrapy intro
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40487
2008-12-12 10:50:41 +00:00
elpolilla
7efa85ccc7
Added missing import and convenient check in ShoveItem pipeline
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40486
2008-12-11 13:54:54 +00:00
elpolilla
4ec69bf133
Added logging to the ShoveItem pipeline
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40485
2008-12-11 13:24:33 +00:00
elpolilla
991fabb783
Added response decompression to feed spiders
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40484
2008-12-10 16:48:02 +00:00
Pablo Hoffman
4d55a69536
removed obsolete dir
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40483
2008-12-10 01:39:32 +00:00
elpolilla
4c0a4222ec
Added some test cases to the ExtractImages adaptor
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40482
2008-12-08 16:33:11 +00:00
elpolilla
686ab9dbd1
Corrected error in CSVFeedSpiders template
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40481
2008-12-08 13:45:21 +00:00
elpolilla
677064964b
Added template for CSVFeedSpiders
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40480
2008-12-08 13:42:25 +00:00
elpolilla
528d239124
Fixed bug in images adaptor when a None value was received
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40479
2008-12-08 12:48:01 +00:00
Daniel Grana
29024d8a4c
Fix a bug introduced by splitting s3 image uploading to use another downloader slot
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40478
2008-12-08 12:00:30 +00:00
Daniel Grana
fade1730ce
Set Cache-Control header for Amazon S3 uploaded images to prevent
...
browsers from requesting images every time.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40477
2008-12-08 12:00:02 +00:00
elpolilla
b52f7726f1
. Reverted change in r473. Now an error message is shown when no rules match the provided url
...
. Modified print_results to make it show the callback from which items/links were extracted
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40476
2008-12-08 11:15:38 +00:00
samus_
2fe8afdade
adding support for dicts (because this is what MySQLdb *really* uses)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40475
2008-12-07 02:56:07 +00:00
elpolilla
f373fc0f7c
Modified parse command to make it default to "parse" method if no rules are found
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40474
2008-12-05 17:56:36 +00:00
elpolilla
aeb95200a1
Modified parse command to accept multiple callbacks
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40473
2008-12-05 16:36:59 +00:00
Daniel Grana
aad2eae55d
S3 pipeline: add a custom method for requesting S3 downloads.
...
This helps split requests done to s3 from requests done to download
images from main spider domain.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40472
2008-12-04 18:52:03 +00:00
elpolilla
0ce417eb0b
Corrected some error messages in order to comply with the conventions
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40471
2008-12-04 16:48:11 +00:00
Daniel Grana
f18bce7062
core: allow engine download to be dynamically backed out to double the max_concurrent_requests for a domain
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40470
2008-12-04 02:10:56 +00:00
Daniel Grana
72c978d8a0
s3: sign requests in downloader middleware to avoid sending signed requests dated more than 15 minutes ago
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40469
2008-12-04 00:38:56 +00:00
elpolilla
c7cb6c4e44
. Added --callback switch to crawl command
...
. Adding again the method 'parse_start_urls'
. Added --callback switch to parse command and removed extraction of links using rules
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40468
2008-12-03 17:41:03 +00:00
elpolilla
e030f38427
Fixed yet another bug in parse command that didn't extract links from all of the rules (just from the matching ones)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40467
2008-12-03 14:24:41 +00:00
elpolilla
242bd38b3f
Fixed another bug in parse command. Only links from matching rules were being extracted
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40466
2008-12-03 14:24:14 +00:00
elpolilla
cc2a699458
Fixed bug in parse command. Links from matching rules weren't being extracted
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40465
2008-12-03 10:37:56 +00:00
Daniel Grana
f371c952dd
s3: convert images to RGB prior to saving as JPEG
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40464
2008-12-02 18:10:24 +00:00
elpolilla
977e13e3b9
Fixed wrong help text in parse command
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40463
2008-12-02 16:04:29 +00:00
elpolilla
ddda08ee51
Fixed some bugs in CrawlSpider
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40462
2008-12-02 15:19:15 +00:00
Daniel Grana
7ac8ae993a
Revert "add 505 and 403 to retry status codes due to amazon s3 random fails while uploading images"
...
This reverts changeset r457
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40461
2008-12-02 13:45:35 +00:00
elpolilla
068d2278a3
Removed silly bug from the previous revision
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40460
2008-12-02 13:29:20 +00:00
elpolilla
a4ba7b9d9a
Added stripping of urls in LinkExtractor, and removed unnecessary check
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40459
2008-12-02 13:26:04 +00:00
Daniel Grana
c22223beb7
add 505 and 403 to retry status codes due to amazon s3 random fails while uploading images
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40458
2008-12-02 13:01:49 +00:00
elpolilla
50379ee21e
Added LinkExtractor.matches test and fixed bug in that same method
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40457
2008-12-02 12:51:32 +00:00
elpolilla
c75ac38b92
Improved parse command and LinkExtractor's matches method
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40456
2008-12-02 12:20:55 +00:00
elpolilla
eff09f2f78
Added some missing postprocessing to items in parse command
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40455
2008-12-02 02:17:15 +00:00
Daniel Grana
3b7f282bbd
fix missing import errors
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40454
2008-12-01 18:44:04 +00:00
Daniel Grana
dd37bb5a4c
MONKEYPATCH: in twisted < 8.0.0, HTTPPageGetter class cannot handle HEAD requests because the empty response body raises PartialDownloadError.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40453
2008-12-01 17:14:06 +00:00
elpolilla
aa3ea811d1
Fixed wrong error message in parse command
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40452
2008-12-01 15:49:04 +00:00
elpolilla
0cff67b25e
Finished implementing new parse command
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40451
2008-12-01 15:44:50 +00:00
elpolilla
8b37282752
Changed CrawlSpider's way of setting crawling rules
...
--HG--
rename : scrapy/trunk/scrapy/contrib/spiders2/crawl.py => scrapy/trunk/scrapy/contrib/spiders/crawl.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40450
2008-12-01 02:59:34 +00:00
Daniel Grana
7f05b923ab
s3 images pipeline: don't include path prefix in key name
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40449
2008-11-28 16:18:03 +00:00
Pablo Hoffman
778b77f2bb
added parse2 command (candidate to replace parse command)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40448
2008-11-28 01:58:49 +00:00
Pablo Hoffman
6d5a130b96
more refactoring to CrawlSpider
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40447
2008-11-28 01:57:07 +00:00
elpolilla
1b92fef9a7
Added scrapy intro RST
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40446
2008-11-27 15:44:29 +00:00
Pablo Hoffman
142cd72fae
removed function check_valid_urlencode from scrapy.utils.url (it didn't fit there because it was too custom)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40445
2008-11-27 15:21:41 +00:00
Ismael Carnales
d865ca4e5b
removed unused and unneeded SITE_URL from localsettings
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40444
2008-11-27 14:54:42 +00:00
Ismael Carnales
2f5cff0b24
removed unneeded index view
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40443
2008-11-27 14:53:02 +00:00
Ismael Carnales
9cc3b2881f
updated requirements
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40442
2008-11-27 14:46:01 +00:00
Ismael Carnales
7683eefc0d
add empty docs/pickle dir for pickled docs
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40441
2008-11-27 14:43:36 +00:00
Ismael Carnales
dae37d3dff
don't store pickled docs in SVN, only the dir
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40440
2008-11-27 14:42:42 +00:00
Ismael Carnales
9e43e69d13
added Makefile for building docs and testing
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40439
2008-11-27 14:39:55 +00:00