mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 04:44:26 +00:00

510 Commits

Author SHA1 Message Date
Ismael Carnales
3cbfb554d2 home must be the first link in a left-to-right reading world
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40410
2008-11-24 12:51:22 +00:00
Ismael Carnales
859733d8ab added link to insophia page
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40409
2008-11-24 12:48:45 +00:00
Ismael Carnales
3dab0c12a0 use relative links for images
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40408
2008-11-24 12:40:59 +00:00
elpolilla
e9e8fa00d9 Added check to the previous commit, because it could have failed
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40407
2008-11-24 12:38:41 +00:00
elpolilla
8f01a70ee7 Removed ugly printing of adaptors dict in parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40406
2008-11-24 12:35:26 +00:00
elpolilla
82cf911192 Added method for inserting adaptors into a pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40405
2008-11-24 12:05:23 +00:00
elpolilla
0f9eba8a97 Removed AdaptorDict and added AdaptorPipe, plus the adaptize helper
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40404
2008-11-24 11:28:18 +00:00
Daniel Grana
cbffced594 add amazon ws helper module
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40403
2008-11-24 10:59:59 +00:00
Daniel Grana
2142d4f7e0 s3 image pipeline: use scrapyengine for downloads
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40402
2008-11-24 10:19:26 +00:00
Daniel Grana
c4434590b0 images pipeline factorization
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40401
2008-11-24 10:18:55 +00:00
Daniel Grana
98bc8ec3e1 spidermiddleware: remove unused "filter" hook
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40400
2008-11-24 10:18:27 +00:00
Daniel Grana
b56a5d933e clean spidermiddleware callback/errback handling
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40399
2008-11-24 10:17:56 +00:00
Daniel Grana
c7b3fae97a add basic spider template
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40398
2008-11-24 10:17:32 +00:00
Daniel Grana
0938984de0 media pipeline: no need for complex deferred execution
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40397
2008-11-24 10:17:02 +00:00
Daniel Grana
a6ff165332 media pipeline: don't hide list of results for item_complete
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40396
2008-11-24 10:16:33 +00:00
Pablo Hoffman
5ccd4df0cd moved log method to BaseSpider, moved adapt_feed doc to docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40395
2008-11-23 01:32:21 +00:00
samus_
d4040d73d9 added docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40394
2008-11-22 19:20:53 +00:00
Daniel Grana
1458686cbf media pipeline: support deferred execution of media_to_download hook
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40393
2008-11-21 10:37:32 +00:00
elpolilla
c15a24c25e Added --force option switch to genspider command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40392
2008-11-21 10:08:22 +00:00
samus_
836c532ef1 some cleanup to the identify mechanism
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40391
2008-11-20 15:36:48 +00:00
elpolilla
17d45f5216 Added CSVFeedSpider
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40390
2008-11-20 14:34:59 +00:00
elpolilla
77a917df6b Modified FOLLOW_LINKS setting name to a more appropriate one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40389
2008-11-20 13:10:06 +00:00
elpolilla
99c4c06380 Added --nofollow option switch to crawl command, and its functionality into CrawlSpiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40388
2008-11-20 12:10:36 +00:00
elpolilla
90fe6ef1b8 Fixed bug that didn't show extracted links in new spiders parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40387
2008-11-19 12:29:04 +00:00
Daniel Grana
56e3d5ae66 Revert "Modified image pipeline to make it able to manage 304 responses"
This reverts commit 3ff5303de7a3d2cffce0244b5bd9aa2b0f5c9c83.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40386
2008-11-18 17:37:25 +00:00
elpolilla
d5469eabb2 Modified image pipeline to make it able to manage 304 responses
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40385
2008-11-18 15:55:27 +00:00
elpolilla
9a7b5a7cbf Improved genspider command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40384
2008-11-18 11:34:03 +00:00
Daniel Grana
862def2d73 move template renderer function to utils package, it can be reused by genspider command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40383
2008-11-17 10:33:49 +00:00
Daniel Grana
d6d32fa672 substitute scrapy_settings template variables on startproject
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40382
2008-11-17 10:33:15 +00:00
elpolilla
bddfbee0d4 Fixed some unicode issues in adaptors, improved docstrings and a test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40381
2008-11-14 13:36:18 +00:00
elpolilla
c209f214a8 -Fixed bug in ExtractImages adaptor that made it fail if it received a string
-Removed BasicSpider and its guid generation method because it wasn't generic enough to be in the framework

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40380
2008-11-14 12:25:56 +00:00
elpolilla
f5eb71fb69 Fixed bug in url_query_cleaner that returned wrong parameters for urls with fragments and added test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40379
2008-11-14 01:32:49 +00:00
Pablo Hoffman
79c9ac388d updated scrapy.utils.misc.hash_values to drop usage of deprecated sha module in favor of hashlib, added tests to hash_values function
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40378
2008-11-13 15:50:27 +00:00
elpolilla
01b157d10b Reverted delist adaptor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40377
2008-11-13 12:27:36 +00:00
elpolilla
5f4053d9fc Reverted to revision 370
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40376
2008-11-13 12:20:32 +00:00
elpolilla
fab0e6938c Removed unnecessary check in RobustScrapedItem's getattr
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40375
2008-11-13 11:01:44 +00:00
elpolilla
4afa29b26a Fixed adaptors test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40374
2008-11-13 10:46:43 +00:00
elpolilla
4ef7c46303 Added ItemAttribute objects
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40373
2008-11-13 10:35:24 +00:00
elpolilla
90d28badf6 Turned unquote adaptor into a function, and added adaptor factory
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40372
2008-11-13 10:13:11 +00:00
elpolilla
36080ada86 Added some missing docstrings to adaptors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40371
2008-11-13 10:04:04 +00:00
elpolilla
323241b6d6 Reverted changes in r368
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40370
2008-11-11 12:51:39 +00:00
elpolilla
c5d134053b Added possibility of checking whether a response should or shouldn't be parsed with the corresponding parse_suffix method (by defining check_suffix) in CrawlSpider
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40369
2008-11-10 15:30:45 +00:00
elpolilla
6db6b8c52e GUID wasn't being set when calling a spider's parse_url method (from the parse command). Fixed that.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40368
2008-11-10 12:26:11 +00:00
samus_
22d713bba9 seems scrapy uses a test spider (scrapy.tests.test_spiders.testplugin.TestSpider) moving the test to decobot
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40367
2008-11-08 13:34:02 +00:00
samus_
3e88890bc3 added check for valid code on spiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40366
2008-11-08 12:26:54 +00:00
Damian Canabal
5e54ac52ff fixed new_response_from_xpaths function, unicode string was passed as body instead of ResponseBody object
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40365
2008-11-04 17:45:26 +00:00
elpolilla
a2461fbeea Wrong attribute assignment fixed again
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40364
2008-11-04 14:06:28 +00:00
elpolilla
224d3c5185 Bugfix in parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40363
2008-11-04 13:27:31 +00:00
elpolilla
5bad79836f Fixed bad checking while setting attributes
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40362
2008-11-04 11:26:23 +00:00
elpolilla
bd38a312d4 - Fixed bug in attribute assignment (empty attributes being set)
- Added GUID setting to FeedSpider

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40361
2008-11-04 10:57:59 +00:00