elpolilla
|
e030f38427
|
Fixed yet another bug in parse command that didn't extract links from all of the rules (just from the matching ones)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40467
|
2008-12-03 14:24:41 +00:00 |
|
elpolilla
|
242bd38b3f
|
Fixed another bug in parse command. Only links from matching rules were being extracted
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40466
|
2008-12-03 14:24:14 +00:00 |
|
elpolilla
|
cc2a699458
|
Fixed bug in parse command. Links from matching rules weren't being extracted
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40465
|
2008-12-03 10:37:56 +00:00 |
|
Daniel Grana
|
f371c952dd
|
s3: convert images to RGB prior to save as JPEG
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40464
|
2008-12-02 18:10:24 +00:00 |
|
elpolilla
|
977e13e3b9
|
Fixed wrong help text in parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40463
|
2008-12-02 16:04:29 +00:00 |
|
elpolilla
|
ddda08ee51
|
Fixed some bugs in CrawlSpider
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40462
|
2008-12-02 15:19:15 +00:00 |
|
Daniel Grana
|
7ac8ae993a
|
Revert "add 505 and 403 to retry status codes due to amazon s3 random fails while uploading images"
This reverts changeset r457
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40461
|
2008-12-02 13:45:35 +00:00 |
|
elpolilla
|
068d2278a3
|
Removed silly bug from the previous revision
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40460
|
2008-12-02 13:29:20 +00:00 |
|
elpolilla
|
a4ba7b9d9a
|
Added stripping of urls in LinkExtractor, and removed unnecessary check
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40459
|
2008-12-02 13:26:04 +00:00 |
|
Daniel Grana
|
c22223beb7
|
add 505 and 403 to retry status codes due to amazon s3 random fails while uploading images
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40458
|
2008-12-02 13:01:49 +00:00 |
|
elpolilla
|
50379ee21e
|
Added LinkExtractor.matches test and fixed bug in that same method
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40457
|
2008-12-02 12:51:32 +00:00 |
|
elpolilla
|
c75ac38b92
|
Improved parse command and LinkExtractor's matches method
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40456
|
2008-12-02 12:20:55 +00:00 |
|
elpolilla
|
eff09f2f78
|
Added some missing postprocessing to items in parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40455
|
2008-12-02 02:17:15 +00:00 |
|
Daniel Grana
|
3b7f282bbd
|
fix missing import errors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40454
|
2008-12-01 18:44:04 +00:00 |
|
Daniel Grana
|
dd37bb5a4c
|
MONKEYPATCH: in twisted < 8.0.0, HTTPPageGetter class cannot handle HEAD requests because the empty response body raises PartialDownloadError.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40453
|
2008-12-01 17:14:06 +00:00 |
|
elpolilla
|
aa3ea811d1
|
Fixed wrong error message in parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40452
|
2008-12-01 15:49:04 +00:00 |
|
elpolilla
|
0cff67b25e
|
Finished implementing new parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40451
|
2008-12-01 15:44:50 +00:00 |
|
elpolilla
|
8b37282752
|
Changed CrawlSpider's way of setting crawling rules
--HG--
rename : scrapy/trunk/scrapy/contrib/spiders2/crawl.py => scrapy/trunk/scrapy/contrib/spiders/crawl.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40450
|
2008-12-01 02:59:34 +00:00 |
|
Daniel Grana
|
7f05b923ab
|
s3 images pipeline: don't include path prefix in key name
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40449
|
2008-11-28 16:18:03 +00:00 |
|
Pablo Hoffman
|
778b77f2bb
|
added parse2 command (candidate to replace parse command)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40448
|
2008-11-28 01:58:49 +00:00 |
|
Pablo Hoffman
|
6d5a130b96
|
more refactoring to CrawlSpider
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40447
|
2008-11-28 01:57:07 +00:00 |
|
elpolilla
|
1b92fef9a7
|
Added scrapy intro RST
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40446
|
2008-11-27 15:44:29 +00:00 |
|
Pablo Hoffman
|
142cd72fae
|
removed function check_valid_urlencode from scrapy.utils.url (it didn't fit there because it was too custom)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40445
|
2008-11-27 15:21:41 +00:00 |
|
Ismael Carnales
|
d865ca4e5b
|
removed unused and unneeded SITE_URL from localsettings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40444
|
2008-11-27 14:54:42 +00:00 |
|
Ismael Carnales
|
2f5cff0b24
|
removed unneeded index view
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40443
|
2008-11-27 14:53:02 +00:00 |
|
Ismael Carnales
|
9cc3b2881f
|
updated requirements
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40442
|
2008-11-27 14:46:01 +00:00 |
|
Ismael Carnales
|
7683eefc0d
|
add empty docs/pickle dir for pickled docs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40441
|
2008-11-27 14:43:36 +00:00 |
|
Ismael Carnales
|
dae37d3dff
|
don't store pickled docs in SVN, only the dir
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40440
|
2008-11-27 14:42:42 +00:00 |
|
Ismael Carnales
|
9e43e69d13
|
added Makefile for building docs and testing
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40439
|
2008-11-27 14:39:55 +00:00 |
|
elpolilla
|
b714b5ab09
|
Added string cast in check_valid_urlencode to prevent urllib from failing
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40438
|
2008-11-27 11:21:28 +00:00 |
|
Pablo Hoffman
|
af842a3e6d
|
added proper fix to canonicalize_url problem with unicode urls already percent-encoded
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40437
|
2008-11-26 22:10:06 +00:00 |
|
samus_
|
8d08ab5f98
|
added excluded characters, also implemented dan's readability tips
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40436
|
2008-11-26 18:31:53 +00:00 |
|
Daniel Grana
|
874331a5b7
|
revert a testing less-than condition
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40435
|
2008-11-26 18:28:49 +00:00 |
|
Pablo Hoffman
|
ce172e8520
|
reverted wrong fix to canonicalize_url
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40434
|
2008-11-26 17:55:03 +00:00 |
|
elpolilla
|
0958eecfca
|
Modified canonicalize_url to make it always work with regular strings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40433
|
2008-11-26 16:18:19 +00:00 |
|
elpolilla
|
8eab5802cd
|
Fixed bug in item.attribute
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40432
|
2008-11-26 16:09:39 +00:00 |
|
elpolilla
|
40b3f06f0b
|
Improved add parameter in item.attribute
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40431
|
2008-11-26 14:48:07 +00:00 |
|
elpolilla
|
26b4be78e3
|
Minor correction in docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40430
|
2008-11-26 13:12:20 +00:00 |
|
elpolilla
|
04bd65de00
|
Added new crawling spider (with rules), but without replacing the previous one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40429
|
2008-11-26 12:49:14 +00:00 |
|
samus_
|
fe49bc2011
|
improved check
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40428
|
2008-11-26 12:05:31 +00:00 |
|
Daniel Grana
|
537a3457cf
|
s3: put image size as metadata
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40427
|
2008-11-25 20:51:02 +00:00 |
|
samus_
|
664a72490f
|
improved url validation
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40426
|
2008-11-25 18:49:17 +00:00 |
|
Ismael Carnales
|
89bd6ede43
|
first commit of doc apps
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40425
|
2008-11-25 18:16:04 +00:00 |
|
Ismael Carnales
|
4241171947
|
style h1s
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40424
|
2008-11-25 17:42:10 +00:00 |
|
Daniel Grana
|
c61d0648e6
|
use hashlib instead of deprecated md5 module
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40423
|
2008-11-25 17:17:25 +00:00 |
|
Daniel Grana
|
81416c4dd1
|
s3: add image md5 content checksum to image path
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40422
|
2008-11-25 17:09:26 +00:00 |
|
Pablo Hoffman
|
33b20be7b5
|
cleanup handling of adaptor pipe debugging code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40421
|
2008-11-25 17:06:36 +00:00 |
|
Ismael Carnales
|
e8df55d101
|
added header and footer templates
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40420
|
2008-11-25 16:23:37 +00:00 |
|
elpolilla
|
011adf4dd7
|
Turned "strip_list" adaptor into "strip"
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40419
|
2008-11-25 15:27:04 +00:00 |
|
Daniel Grana
|
1a60f6c1cb
|
comment default thumb sizes
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40418
|
2008-11-25 13:44:06 +00:00 |
|