mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 17:24:12 +00:00

515 Commits

Author SHA1 Message Date
Ismael Carnales
89de849ed1 updated install link
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40515
2008-12-16 16:17:29 +00:00
Ismael Carnales
6766a39628 added view to serve images
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40514
2008-12-16 16:14:15 +00:00
Ismael Carnales
aab5a4a7dd added links and reformatted some texts
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40513
2008-12-16 15:53:01 +00:00
Ismael Carnales
e140b1e435 completed the install doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40512
2008-12-16 15:40:27 +00:00
Ismael Carnales
a026e382ed renamed intro to overview
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40511
2008-12-16 15:02:32 +00:00
Ismael Carnales
e1ea75baac archived old documentation
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40510
2008-12-16 15:01:26 +00:00
Ismael Carnales
99f956c166 making dir for old docs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40509
2008-12-16 15:00:35 +00:00
Ismael Carnales
de003d15cd renamed tutorial2 to tutorial
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40508
2008-12-16 14:59:38 +00:00
Ismael Carnales
7f5117e337 reworking docs structure
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40507
2008-12-16 14:58:52 +00:00
Ismael Carnales
3e1eea8da0 updated install with info for Arch Linux
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40506
2008-12-16 13:16:08 +00:00
Ismael Carnales
4b2d29a6ce renamed .static to _static
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40505
2008-12-16 11:58:24 +00:00
Ismael Carnales
26c749b396 added extension for settings and basic skeleton of reference/settings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40504
2008-12-16 11:53:20 +00:00
elpolilla
8d4a1e66e8 Modified the new scrapy tutorial and added some images
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40503
2008-12-16 11:46:19 +00:00
elpolilla
14b2a4ff81 Modified Scrapy tutorial
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40502
2008-12-16 00:39:48 +00:00
samus_
f9931518aa restoring decompressor while errors are fixed
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40501
2008-12-15 17:08:57 +00:00
elpolilla
95c00c99b0 Corrected syntax errors in templates
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40500
2008-12-15 15:06:52 +00:00
Ismael Carnales
e5a476f95e fixed footer menu
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40499
2008-12-15 14:30:25 +00:00
elpolilla
2e01fa1667 Added part of the new scrapy tutorial
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40498
2008-12-15 13:44:42 +00:00
elpolilla
1cf569f88a Reverting last change in scrapy-ctl script. PYTHONPATH should be set manually
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40497
2008-12-15 12:14:41 +00:00
Pablo Hoffman
d72056c91e removed incorrect text
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40496
2008-12-15 12:02:16 +00:00
elpolilla
2d0cca538e . Added path detection to scrapy-ctl script, which was giving problems with new projects
. Corrected typo in CrawlSpiders template

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40495
2008-12-15 11:58:57 +00:00
elpolilla
6b0e1f090e . Replaced Decompressor tool by Decompression middleware, including its tests
. Removed references to Decompressor tool in FeedSpiders

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40494
2008-12-15 02:29:47 +00:00
Pablo Hoffman
dcc340234d scrapy.org: updated Home, added Download & Home articles. Updated footer & header
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40493
2008-12-14 06:29:19 +00:00
Pablo Hoffman
becf3bd7aa added BSD license
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40492
2008-12-14 05:23:47 +00:00
Pablo Hoffman
4599893f35 fixed bug in DownloaderStats middleware
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40491
2008-12-14 04:32:01 +00:00
Pablo Hoffman
1510efcac9 Sorted out downloader-related stats:
* added DownloaderStats (downloader middleware) to handle all downloader-related stats
* removed downloader related stats from CoreStats
* removed downloader exception stats from RetryMiddleware

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40490
2008-12-14 04:01:54 +00:00
samus_
3d16160bf7 adding RetryMiddleware errors to the stats
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40489
2008-12-14 01:49:59 +00:00
samus_
bd4c106a61 * adjusted RETRY_TIMES setting and behaviour
* refactored scrapy.contrib.downloadermiddleware.retry.RetryMiddleware
* created test for it (scrapy.tests.test_middleware_retry.RetryTest)
* updated docstring at scrapy.core.downloader.middleware.DownloaderMiddlewareManager

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40488
2008-12-12 21:26:00 +00:00
Ismael Carnales
f7fce6cdb9 changed formatting of Scrapy intro
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40487
2008-12-12 10:50:41 +00:00
elpolilla
7efa85ccc7 Added missing import and convenient check in ShoveItem pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40486
2008-12-11 13:54:54 +00:00
elpolilla
4ec69bf133 Added logging to the ShoveItem pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40485
2008-12-11 13:24:33 +00:00
elpolilla
991fabb783 Added response decompression to feed spiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40484
2008-12-10 16:48:02 +00:00
Pablo Hoffman
4d55a69536 removed obsolete dir
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40483
2008-12-10 01:39:32 +00:00
elpolilla
4c0a4222ec Added some test cases to the ExtractImages adaptor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40482
2008-12-08 16:33:11 +00:00
elpolilla
686ab9dbd1 Corrected error in CSVFeedSpiders template
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40481
2008-12-08 13:45:21 +00:00
elpolilla
677064964b Added template for CSVFeedSpiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40480
2008-12-08 13:42:25 +00:00
elpolilla
528d239124 Fixed bug in images adaptor when a None value was received
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40479
2008-12-08 12:48:01 +00:00
Daniel Grana
29024d8a4c Fix a bug introduced by splitting s3 image uploading to use another downloader slot
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40478
2008-12-08 12:00:30 +00:00
Daniel Grana
fade1730ce Set Cache-Control header for Amazon S3 uploaded images to prevent
browsers from requesting images every time.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40477
2008-12-08 12:00:02 +00:00
elpolilla
b52f7726f1 . Reverted change in r473. Now an error message is shown when no rules match the provided url
. Modified print_results to make it show the callback from which items/links were extracted

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40476
2008-12-08 11:15:38 +00:00
samus_
2fe8afdade adding support for dicts (because this is what MySQLdb *really* uses)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40475
2008-12-07 02:56:07 +00:00
elpolilla
f373fc0f7c Modified parse command to make it default to "parse" method if no rules are found
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40474
2008-12-05 17:56:36 +00:00
elpolilla
aeb95200a1 Modified parse command to accept multiple callbacks
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40473
2008-12-05 16:36:59 +00:00
Daniel Grana
aad2eae55d S3 pipeline: add a custom method for downloading S3 requests.
This helps separate requests made to S3 from requests made to download
images from the main spider domain.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40472
2008-12-04 18:52:03 +00:00
elpolilla
0ce417eb0b Corrected some error messages in order to comply with the conventions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40471
2008-12-04 16:48:11 +00:00
Daniel Grana
f18bce7062 core: allow engine download to be dynamically backed out to double the max_concurrent_requests for a domain
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40470
2008-12-04 02:10:56 +00:00
Daniel Grana
72c978d8a0 s3: sign requests in downloader middleware to avoid sending signed requests dated more than 15 minutes ago
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40469
2008-12-04 00:38:56 +00:00
elpolilla
c7cb6c4e44 . Added --callback switch to crawl command
. Adding again the method 'parse_start_urls'
. Added --callback switch to parse command and removed extraction of links using rules

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40468
2008-12-03 17:41:03 +00:00
elpolilla
e030f38427 Fixed yet another bug in parse command that didn't extract links from all of the rules (just from the matching ones)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40467
2008-12-03 14:24:41 +00:00
elpolilla
242bd38b3f Fixed another bug in parse command. Only links from matching rules were being extracted
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40466
2008-12-03 14:24:14 +00:00