mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 08:43:43 +00:00

454 Commits

Author SHA1 Message Date
Daniel Grana
3b7f282bbd fix missing import errors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40454
2008-12-01 18:44:04 +00:00
Daniel Grana
dd37bb5a4c MONKEYPATCH: in twisted < 8.0.0, the HTTPPageGetter class cannot handle HEAD requests because the empty response body raises PartialDownloadError.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40453
2008-12-01 17:14:06 +00:00
elpolilla
aa3ea811d1 Fixed wrong error message in parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40452
2008-12-01 15:49:04 +00:00
elpolilla
0cff67b25e Finished implementing new parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40451
2008-12-01 15:44:50 +00:00
elpolilla
8b37282752 Changed CrawlSpider's way of setting crawling rules
--HG--
rename : scrapy/trunk/scrapy/contrib/spiders2/crawl.py => scrapy/trunk/scrapy/contrib/spiders/crawl.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40450
2008-12-01 02:59:34 +00:00
Daniel Grana
7f05b923ab s3 images pipeline: don't include path prefix in key name
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40449
2008-11-28 16:18:03 +00:00
Pablo Hoffman
778b77f2bb added parse2 command (candidate to replace parse command)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40448
2008-11-28 01:58:49 +00:00
Pablo Hoffman
6d5a130b96 more refactoring to CrawlSpider
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40447
2008-11-28 01:57:07 +00:00
elpolilla
1b92fef9a7 Added scrapy intro RST
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40446
2008-11-27 15:44:29 +00:00
Pablo Hoffman
142cd72fae removed function check_valid_urlencode from scrapy.utils.url (it didn't fit there because it was too custom)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40445
2008-11-27 15:21:41 +00:00
Ismael Carnales
d865ca4e5b removed unused and unneeded SITE_URL from localsettings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40444
2008-11-27 14:54:42 +00:00
Ismael Carnales
2f5cff0b24 removed unneeded index view
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40443
2008-11-27 14:53:02 +00:00
Ismael Carnales
9cc3b2881f updated requirements
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40442
2008-11-27 14:46:01 +00:00
Ismael Carnales
7683eefc0d add empty docs/pickle dir for pickled docs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40441
2008-11-27 14:43:36 +00:00
Ismael Carnales
dae37d3dff don't store pickled docs in SVN, only the dir
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40440
2008-11-27 14:42:42 +00:00
Ismael Carnales
9e43e69d13 added Makefile for building docs and testing
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40439
2008-11-27 14:39:55 +00:00
elpolilla
b714b5ab09 Added string cast in check_valid_urlencode to prevent urllib from failing
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40438
2008-11-27 11:21:28 +00:00
Pablo Hoffman
af842a3e6d added proper fix to canonicalize_url problem with unicode urls already percent-encoded
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40437
2008-11-26 22:10:06 +00:00
samus_
8d08ab5f98 added excluded characters, also implemented dan's readability tips
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40436
2008-11-26 18:31:53 +00:00
Daniel Grana
874331a5b7 revert a testing less-than condition
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40435
2008-11-26 18:28:49 +00:00
Pablo Hoffman
ce172e8520 reverted wrong fix to canonicalize_url
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40434
2008-11-26 17:55:03 +00:00
elpolilla
0958eecfca Modified canonicalize_url to make it always work with regular strings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40433
2008-11-26 16:18:19 +00:00
elpolilla
8eab5802cd Fixed bug in item.attribute
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40432
2008-11-26 16:09:39 +00:00
elpolilla
40b3f06f0b Improved add parameter in item.attribute
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40431
2008-11-26 14:48:07 +00:00
elpolilla
26b4be78e3 Minor correction in docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40430
2008-11-26 13:12:20 +00:00
elpolilla
04bd65de00 Added new crawling spider (with rules), but without replacing the previous one
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40429
2008-11-26 12:49:14 +00:00
samus_
fe49bc2011 improved check
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40428
2008-11-26 12:05:31 +00:00
Daniel Grana
537a3457cf s3: put image size as metadata
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40427
2008-11-25 20:51:02 +00:00
samus_
664a72490f improved url validation
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40426
2008-11-25 18:49:17 +00:00
Ismael Carnales
89bd6ede43 first commit of doc apps
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40425
2008-11-25 18:16:04 +00:00
Ismael Carnales
4241171947 style h1s
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40424
2008-11-25 17:42:10 +00:00
Daniel Grana
c61d0648e6 use hashlib instead of deprecated md5 module
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40423
2008-11-25 17:17:25 +00:00
Daniel Grana
81416c4dd1 s3: add image md5 content checksum to image path
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40422
2008-11-25 17:09:26 +00:00
Pablo Hoffman
33b20be7b5 cleanup handling of adaptor pipe debugging code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40421
2008-11-25 17:06:36 +00:00
Ismael Carnales
e8df55d101 added header and footer templates
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40420
2008-11-25 16:23:37 +00:00
elpolilla
011adf4dd7 Turned "strip_list" adaptor into "strip"
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40419
2008-11-25 15:27:04 +00:00
Daniel Grana
1a60f6c1cb comment default thumb sizes
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40418
2008-11-25 13:44:06 +00:00
Daniel Grana
f61be3f078 fix syntax typo
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40417
2008-11-25 11:58:31 +00:00
Daniel Grana
072fed8d4a aws: request signing function and tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40416
2008-11-25 11:06:34 +00:00
Pablo Hoffman
0fbb7579f6 added new CrawlSpider that's gonna replace the old CrawlSpider soon
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40415
2008-11-25 02:38:26 +00:00
elpolilla
f558a49d2c Fixed ExtractImages adaptor, which wasn't looking for a base tag
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40414
2008-11-25 01:41:22 +00:00
Pablo Hoffman
5028146e8c added __slots__ to decrease memory footprint of Link objects
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40413
2008-11-25 00:39:11 +00:00
elpolilla
a8903d5877 Changed deletion of _adaptor_dict attribute in parse command that could trigger exceptions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40412
2008-11-24 18:31:05 +00:00
elpolilla
68e497e532 Fixed some little bugs in adaptors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40411
2008-11-24 13:09:08 +00:00
Ismael Carnales
3cbfb554d2 home must be the first link in a left-to-right reading world
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40410
2008-11-24 12:51:22 +00:00
Ismael Carnales
859733d8ab added link to insophia page
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40409
2008-11-24 12:48:45 +00:00
Ismael Carnales
3dab0c12a0 use relative links for images
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40408
2008-11-24 12:40:59 +00:00
elpolilla
e9e8fa00d9 Added check to the previous commit, because it could have failed
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40407
2008-11-24 12:38:41 +00:00
elpolilla
8f01a70ee7 Removed ugly printing of adaptors dict in parse command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40406
2008-11-24 12:35:26 +00:00
elpolilla
82cf911192 Added method for inserting adaptors into a pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40405
2008-11-24 12:05:23 +00:00