elpolilla
125414c15a
Added some basic tests for the ImagePipeline (although a few are still missing)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40317
2008-10-16 14:42:51 +00:00
elpolilla
3433272d71
- Added adaptors tests
...
- Fixed some small bugs in a few adaptors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40316
2008-10-16 13:29:23 +00:00
elpolilla
0ed5978abf
Improved adaptors functionality:
...
- Added many basic adaptors (like extract, extract_links, regex, etc.)
- Added some basic pipelines (for single data, lists, urls, etc.)
- Now XPathSelectors store the response with which they were created (if any)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40315
2008-10-13 14:30:51 +00:00
olveyra
e6f73c3dfa
removed rulengine and simpage code
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40314
2008-10-09 18:59:32 +00:00
Pablo Hoffman
e9f3913328
changed TEST_DB setting to TEST_SCRAPING_DB
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40313
2008-10-07 14:31:24 +00:00
Daniel Grana
498cd3a356
oops, missing import
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40312
2008-10-07 13:07:36 +00:00
Daniel Grana
0ae4a19e08
remove hardcoded user/password for testing storedb
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40311
2008-10-07 13:01:32 +00:00
elpolilla
fc782d5e4b
Modified ItemDeltas to work with RobustScrapedItems instead of ScrapedItems
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40310
2008-10-06 17:12:33 +00:00
Andres Moreira
05f4a26cca
Fixed a bug in unicode support. On some platforms the empty string ('') is decoded as ASCII regardless of Python's default encoding, so it was changed to u''.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40309
2008-10-06 11:58:11 +00:00
elpolilla
a309dfeb7d
Added ItemDelta objects and modified replays to make use of them
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40308
2008-10-06 10:14:52 +00:00
Pablo Hoffman
88597e3a77
removed unneeded DEFAULT_DATA_ENCODING and commented COMMANDS_MODULE in project template
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40307
2008-10-06 03:23:28 +00:00
Pablo Hoffman
d6cbaab65b
added setadaptors method to ScrapedItem, removed incorrect constructor
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40306
2008-10-06 03:22:05 +00:00
Pablo Hoffman
9c19316d21
removed unused imports and minor bug fix to media/image pipeline
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40305
2008-10-06 02:36:39 +00:00
samus_
c74364809e
removed the utf-16 xpathselector_iternodes test case: the problem comes from UnicodeDammit's conversion, so the results of the test are misleading
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40304
2008-10-05 18:32:39 +00:00
Pablo Hoffman
8f80603acd
added RegexLinkExtractor in new scrapy.link.extractors module
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40303
2008-10-05 07:57:51 +00:00
Pablo Hoffman
c68c478ac3
added scrapy.contrib.spiders module
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40302
2008-10-05 07:45:03 +00:00
Pablo Hoffman
7b2317d935
added scrapy.utils.response module
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40301
2008-10-05 07:39:26 +00:00
Pablo Hoffman
24df5d1602
improved scrapy-admin.py script and default project template
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40300
2008-10-05 07:37:21 +00:00
Pablo Hoffman
8dc46c16ce
removed old comment
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40299
2008-10-05 07:35:58 +00:00
Pablo Hoffman
acebe63d0a
added __nonzero__ method to XPathSelector
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40298
2008-10-05 06:58:39 +00:00
Damian Canabal
ac4688dabb
added NotSupported Exception
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40297
2008-10-03 19:50:26 +00:00
Andres Moreira
768a31a483
Added tests for utils/markup.py. Added unicode support to the new markup functions. Changed some comments.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40296
2008-10-03 14:37:25 +00:00
Andres Moreira
453714b252
Added new functions to parse html.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40295
2008-10-03 11:57:51 +00:00
Pablo Hoffman
71c3cea112
added another test for safe_url_string function
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40294
2008-10-03 04:14:07 +00:00
olveyra
e447044fb9
reverted experimental code that should not have been committed
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40293
2008-10-02 22:41:51 +00:00
olveyra
1f4e484a5c
added DOWNLOAD_DELAY comment in settings template
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40292
2008-10-02 20:39:36 +00:00
olveyra
58a419f45c
added support for global DOWNLOAD_DELAY setting
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40291
2008-10-02 19:59:51 +00:00
olveyra
3ec47301e3
allow directly specifying which domain corresponds to a given request
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40290
2008-10-02 19:22:34 +00:00
Pablo Hoffman
98f3314bed
simplified test without losing functionality
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40289
2008-09-30 19:57:05 +00:00
olveyra
9ba4573b13
added test for r284
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40288
2008-09-30 19:24:24 +00:00
olveyra
c755f5a535
better management of some redirection loops.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40287
2008-09-30 16:56:12 +00:00
olveyra
433f35b417
small fix
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40286
2008-09-30 16:19:20 +00:00
olveyra
a38c72bb05
safe_url_string should not escape unreserved marks (see RFC 2396, sec 2.3)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40285
2008-09-30 14:15:23 +00:00
olveyra
3f32bae45f
- copied original request to response.request in get_url method
...
- deleted unused comment
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40284
2008-09-30 13:44:11 +00:00
elpolilla
3ae4d62f38
- Added the possibility of knowing the decompressed response's format in the decompression tool
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40283
2008-09-29 18:14:28 +00:00
Damian Canabal
f8f2f3a542
rolled back public ent_re to private and added a function has_entities instead
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40282
2008-09-29 13:21:35 +00:00
Damian Canabal
ad00d5e632
changed private html entity regex to public
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40281
2008-09-29 12:52:54 +00:00
samus_
c07937c4df
added comment
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40280
2008-09-25 12:49:19 +00:00
samus_
c4aa1e7e8b
small fix to the regex
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40279
2008-09-25 12:45:43 +00:00
Daniel Grana
555cb0940b
images: use brief exception logging
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40278
2008-09-25 12:22:31 +00:00
olveyra
c73cbdd9b1
allow overriding the item class adaptor in the constructor
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40277
2008-09-25 02:14:02 +00:00
samus_
79485ceb5c
added support for xml-declared encodings
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40276
2008-09-24 18:19:19 +00:00
Damian Canabal
a4b709e03f
added test for remove entities
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40275
2008-09-24 12:40:55 +00:00
Damian Canabal
4e7e539b03
html entities regexp improvement
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40274
2008-09-24 12:25:10 +00:00
Pablo Hoffman
79be2ca97d
moved scrapy.core.log module to scrapy.log
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40273
2008-09-23 22:18:20 +00:00
Pablo Hoffman
fbec2b3a43
moved scrapy.core.mail module to scrapy.mail
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40272
2008-09-23 21:52:05 +00:00
Pablo Hoffman
cb8e0d9bdd
added scrapy.utils.defer, moved deferred functions from scrapy.utils.misc to that module
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40271
2008-09-23 21:48:48 +00:00
Pablo Hoffman
6637d8cf68
removed location_str function incorrectly added to this module
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40270
2008-09-23 21:34:05 +00:00
samus_
dbfc6341c2
removed duplicated function convert_entity from scrapy.utils.misc
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40269
2008-09-23 20:36:32 +00:00
german
4842fe0679
removed unquote_html
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40268
2008-09-22 18:04:30 +00:00