elpolilla
|
0958eecfca
|
Modified canonicalize_url to make it always work with regular strings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40433
|
2008-11-26 16:18:19 +00:00 |
|
elpolilla
|
8eab5802cd
|
Fixed bug in item.attribute
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40432
|
2008-11-26 16:09:39 +00:00 |
|
elpolilla
|
40b3f06f0b
|
Improved add parameter in item.attribute
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40431
|
2008-11-26 14:48:07 +00:00 |
|
elpolilla
|
26b4be78e3
|
Minor correction in docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40430
|
2008-11-26 13:12:20 +00:00 |
|
elpolilla
|
04bd65de00
|
Added new crawling spider (with rules), but without replacing the previous one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40429
|
2008-11-26 12:49:14 +00:00 |
|
samus_
|
fe49bc2011
|
improved check
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40428
|
2008-11-26 12:05:31 +00:00 |
|
Daniel Grana
|
537a3457cf
|
s3: put image size as metadata
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40427
|
2008-11-25 20:51:02 +00:00 |
|
samus_
|
664a72490f
|
improved url validation
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40426
|
2008-11-25 18:49:17 +00:00 |
|
Ismael Carnales
|
89bd6ede43
|
first commit of doc apps
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40425
|
2008-11-25 18:16:04 +00:00 |
|
Ismael Carnales
|
4241171947
|
style h1s
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40424
|
2008-11-25 17:42:10 +00:00 |
|
Daniel Grana
|
c61d0648e6
|
use hashlib instead of deprecated md5 module
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40423
|
2008-11-25 17:17:25 +00:00 |
|
Daniel Grana
|
81416c4dd1
|
s3: add image md5 content checksum to image path
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40422
|
2008-11-25 17:09:26 +00:00 |
|
Pablo Hoffman
|
33b20be7b5
|
cleanup handling of adaptor pipe debugging code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40421
|
2008-11-25 17:06:36 +00:00 |
|
Ismael Carnales
|
e8df55d101
|
added header and footer templates
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40420
|
2008-11-25 16:23:37 +00:00 |
|
elpolilla
|
011adf4dd7
|
Turned "strip_list" adaptor into "strip"
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40419
|
2008-11-25 15:27:04 +00:00 |
|
Daniel Grana
|
1a60f6c1cb
|
comment default thumb sizes
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40418
|
2008-11-25 13:44:06 +00:00 |
|
Daniel Grana
|
f61be3f078
|
fix syntax typo
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40417
|
2008-11-25 11:58:31 +00:00 |
|
Daniel Grana
|
072fed8d4a
|
aws: request signing function and tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40416
|
2008-11-25 11:06:34 +00:00 |
|
Pablo Hoffman
|
0fbb7579f6
|
added new CrawlSpider that's gonna replace the old CrawlSpider soon
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40415
|
2008-11-25 02:38:26 +00:00 |
|
elpolilla
|
f558a49d2c
|
Fixed ExtractImages adaptor, which wasn't looking for a base tag
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40414
|
2008-11-25 01:41:22 +00:00 |
|
Pablo Hoffman
|
5028146e8c
|
added __slots__ to decrease memory footprint of Link objects
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40413
|
2008-11-25 00:39:11 +00:00 |
|
elpolilla
|
a8903d5877
|
Changed deletion of _adaptor_dict attribute in parse command that could trigger exceptions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40412
|
2008-11-24 18:31:05 +00:00 |
|
elpolilla
|
68e497e532
|
Fixed some little bugs in adaptors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40411
|
2008-11-24 13:09:08 +00:00 |
|
Ismael Carnales
|
3cbfb554d2
|
home must be the first link in a left-to-right reading world
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40410
|
2008-11-24 12:51:22 +00:00 |
|
Ismael Carnales
|
859733d8ab
|
added link to insophia page
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40409
|
2008-11-24 12:48:45 +00:00 |
|
Ismael Carnales
|
3dab0c12a0
|
use relative links for images
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40408
|
2008-11-24 12:40:59 +00:00 |
|
elpolilla
|
e9e8fa00d9
|
Added check to the previous commit, because it could have failed
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40407
|
2008-11-24 12:38:41 +00:00 |
|
elpolilla
|
8f01a70ee7
|
Removed ugly printing of adaptors dict in parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40406
|
2008-11-24 12:35:26 +00:00 |
|
elpolilla
|
82cf911192
|
Added method for inserting adaptors into a pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40405
|
2008-11-24 12:05:23 +00:00 |
|
elpolilla
|
0f9eba8a97
|
Removed AdaptorDict and added AdaptorPipe, plus the adaptize helper
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40404
|
2008-11-24 11:28:18 +00:00 |
|
Daniel Grana
|
cbffced594
|
add amazon ws helper module
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40403
|
2008-11-24 10:59:59 +00:00 |
|
Daniel Grana
|
2142d4f7e0
|
s3 image pipeline: use scrapyengine for downloads
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40402
|
2008-11-24 10:19:26 +00:00 |
|
Daniel Grana
|
c4434590b0
|
images pipeline factorization
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40401
|
2008-11-24 10:18:55 +00:00 |
|
Daniel Grana
|
98bc8ec3e1
|
spidermiddleware: remove unused "filter" hook
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40400
|
2008-11-24 10:18:27 +00:00 |
|
Daniel Grana
|
b56a5d933e
|
clean spidermiddleware callback/errback handling
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40399
|
2008-11-24 10:17:56 +00:00 |
|
Daniel Grana
|
c7b3fae97a
|
add basic spider template
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40398
|
2008-11-24 10:17:32 +00:00 |
|
Daniel Grana
|
0938984de0
|
media pipeline: no need for complex deferred execution
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40397
|
2008-11-24 10:17:02 +00:00 |
|
Daniel Grana
|
a6ff165332
|
media pipeline: don't hide list of results for item_complete
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40396
|
2008-11-24 10:16:33 +00:00 |
|
Pablo Hoffman
|
5ccd4df0cd
|
moved log method to BaseSpider, moved adapt_feed doc to docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40395
|
2008-11-23 01:32:21 +00:00 |
|
samus_
|
d4040d73d9
|
added docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40394
|
2008-11-22 19:20:53 +00:00 |
|
Daniel Grana
|
1458686cbf
|
media pipeline: support deferred execution of media_to_download hook
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40393
|
2008-11-21 10:37:32 +00:00 |
|
elpolilla
|
c15a24c25e
|
Added --force option switch to genspider command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40392
|
2008-11-21 10:08:22 +00:00 |
|
samus_
|
836c532ef1
|
some cleanup to the identify mechanism
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40391
|
2008-11-20 15:36:48 +00:00 |
|
elpolilla
|
17d45f5216
|
Added CSVFeedSpider
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40390
|
2008-11-20 14:34:59 +00:00 |
|
elpolilla
|
77a917df6b
|
Modified FOLLOW_LINKS setting name to a more appropriate one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40389
|
2008-11-20 13:10:06 +00:00 |
|
elpolilla
|
99c4c06380
|
Added --nofollow option switch to crawl command, and its functionality into CrawlSpiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40388
|
2008-11-20 12:10:36 +00:00 |
|
elpolilla
|
90fe6ef1b8
|
Fixed bug that didn't show extracted links in new spiders parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40387
|
2008-11-19 12:29:04 +00:00 |
|
Daniel Grana
|
56e3d5ae66
|
Revert "Modified image pipeline to make it able to manage 304 responses"
This reverts commit 3ff5303de7a3d2cffce0244b5bd9aa2b0f5c9c83.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40386
|
2008-11-18 17:37:25 +00:00 |
|
elpolilla
|
d5469eabb2
|
Modified image pipeline to make it able to manage 304 responses
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40385
|
2008-11-18 15:55:27 +00:00 |
|
elpolilla
|
9a7b5a7cbf
|
Improved genspider command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40384
|
2008-11-18 11:34:03 +00:00 |
|