Daniel Grana
93c313853d
use request.deferred for cached values too
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40250
2008-09-17 12:34:48 +00:00
Daniel Grana
6f2d36199b
media: use deferred of request to allow adding callback from get_media_requests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40249
2008-09-16 23:40:48 +00:00
Daniel Grana
7932a8a5b6
add image pipeline that stores images and can thumbnail them in different sizes
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40248
2008-09-16 23:40:06 +00:00
Daniel Grana
4807f2eb82
move media_to_download hook to allow caching and prevent downloading
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40247
2008-09-16 23:39:26 +00:00
Daniel Grana
5048491f8d
mediapipeline: replace cache variable with domaininfo
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40246
2008-09-16 19:31:25 +00:00
Daniel Grana
41a4f22c27
messy typo adding errback
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40245
2008-09-16 14:03:39 +00:00
anibal
e0d90f9d87
Scrapy architecture diagram 1st version, dia source
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40244
2008-09-16 11:56:15 +00:00
Pablo Hoffman
0a7a5e5b3d
added docs dir, guess for what? :)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40243
2008-09-16 11:30:22 +00:00
Pablo Hoffman
b57148dfad
renamed scrapy.lib to scrapy.xlib
...
--HG--
rename : scrapy/trunk/scrapy/lib/BeautifulSoup.py => scrapy/trunk/scrapy/xlib/BeautifulSoup.py
rename : scrapy/trunk/scrapy/lib/__init__.py => scrapy/trunk/scrapy/xlib/__init__.py
rename : scrapy/trunk/scrapy/lib/lrucache.py => scrapy/trunk/scrapy/xlib/lrucache.py
rename : scrapy/trunk/scrapy/lib/lsprofcalltree.py => scrapy/trunk/scrapy/xlib/lsprofcalltree.py
rename : scrapy/trunk/scrapy/lib/pydispatch/__init__.py => scrapy/trunk/scrapy/xlib/pydispatch/__init__.py
rename : scrapy/trunk/scrapy/lib/pydispatch/dispatcher.py => scrapy/trunk/scrapy/xlib/pydispatch/dispatcher.py
rename : scrapy/trunk/scrapy/lib/pydispatch/errors.py => scrapy/trunk/scrapy/xlib/pydispatch/errors.py
rename : scrapy/trunk/scrapy/lib/pydispatch/license.txt => scrapy/trunk/scrapy/xlib/pydispatch/license.txt
rename : scrapy/trunk/scrapy/lib/pydispatch/robust.py => scrapy/trunk/scrapy/xlib/pydispatch/robust.py
rename : scrapy/trunk/scrapy/lib/pydispatch/robustapply.py => scrapy/trunk/scrapy/xlib/pydispatch/robustapply.py
rename : scrapy/trunk/scrapy/lib/pydispatch/saferef.py => scrapy/trunk/scrapy/xlib/pydispatch/saferef.py
rename : scrapy/trunk/scrapy/lib/spidermonkey/INSTALL.scrapy => scrapy/trunk/scrapy/xlib/spidermonkey/INSTALL.scrapy
rename : scrapy/trunk/scrapy/lib/spidermonkey/__init__.py => scrapy/trunk/scrapy/xlib/spidermonkey/__init__.py
rename : scrapy/trunk/scrapy/lib/spidermonkey/sm_settings.py => scrapy/trunk/scrapy/xlib/spidermonkey/sm_settings.py
rename : scrapy/trunk/scrapy/lib/spidermonkey/spidermonkey.py => scrapy/trunk/scrapy/xlib/spidermonkey/spidermonkey.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40242
2008-09-15 15:37:44 +00:00
Pablo Hoffman
2cb300b926
removed simplejson from scrapy code
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40241
2008-09-15 15:29:12 +00:00
Daniel Grana
08feb4df1c
rename methods and enforce extraction of Request objects as item media to download
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40240
2008-09-15 13:55:23 +00:00
Pablo Hoffman
0090fcb0f5
improved fix to support IE7 (thanks bill!)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40239
2008-09-15 13:08:44 +00:00
Pablo Hoffman
d7a88e8d77
fixed hack for IE6
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40238
2008-09-15 12:58:19 +00:00
Pablo Hoffman
31d3b2cf5a
added hack for IE6
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40237
2008-09-15 12:56:31 +00:00
Daniel Grana
4944762341
move bugtraps to avoid silencing bugs in media_downloaded or media_failed
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40236
2008-09-15 12:46:34 +00:00
Daniel Grana
ca695c8c79
small bugfix and add bugtraps
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40235
2008-09-15 08:57:51 +00:00
Daniel Grana
6407a744e3
remove item.image_urls from get_urls_from_item
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40234
2008-09-15 08:27:15 +00:00
Daniel Grana
af0808be95
add info object to public methods
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40233
2008-09-15 07:08:27 +00:00
Daniel Grana
4263c3a2e1
Add docstrings to public methods, remove unused imports, normalize URLs to requests.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40232
2008-09-15 06:01:40 +00:00
Daniel Grana
31e497e5eb
add media_to_download hook support, and return cached result if available
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40231
2008-09-15 04:33:40 +00:00
olveyra
df2f4b21b8
removed reference to guid (decobot specific) attribute in item
...
pipeline logs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40230
2008-09-15 01:23:10 +00:00
Daniel Grana
3803f67645
mediamiddleware: bugfixes and remove prints
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40229
2008-09-12 21:55:14 +00:00
olveyra
2fd5cf644e
- Support for no output in cluster status
...
- schedule option by default does not have output
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40228
2008-09-12 20:48:48 +00:00
Daniel Grana
d612701b66
add media pipeline skeleton
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40227
2008-09-12 19:58:45 +00:00
Andres Moreira
2dc20bb56e
Fixed condition of new option --quiet.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40226
2008-09-11 16:35:50 +00:00
olveyra
4cffc039ff
- leave in adaptors.py only the elemental adaptors pipeline code.
...
- moved to contrib/adaptorpipeline the advanced adaptors pipeline code.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40225
2008-09-11 15:33:35 +00:00
Andres Moreira
23814537b3
Fixed bug in the total items count. Added new option --quiet to the diff action. This option suppresses the report if there are no differences.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40224
2008-09-11 14:59:16 +00:00
elpolilla
2b6bcd61df
Small piece of code moved for not complying with conventions
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40223
2008-09-11 14:50:15 +00:00
elpolilla
06964cf36b
some small semantic errors fixed in xpath test
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40222
2008-09-11 12:13:33 +00:00
elpolilla
d8b7f41ee5
implemented extract_unquote method over xpath selectors
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40221
2008-09-11 11:14:44 +00:00
olveyra
da7b87962a
- AdaptorPipe compilation feature to resolve the efficiency problem
...
and maintain the simplicity of the current pipeline definition approach (for
people who prefer to bother with lots of lines to define and edit a
pipeline, they can directly edit the compiled version)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40220
2008-09-10 14:06:28 +00:00
elpolilla
789d8d3531
bugfix in decompressor tool
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40219
2008-09-10 02:49:05 +00:00
olveyra
ec710702ce
Added getd command, similar to get but decompresses the response first.
...
Based on decompressor tool.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40218
2008-09-09 15:05:42 +00:00
elpolilla
543b9de973
pylinted decompressor
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40217
2008-09-09 14:12:57 +00:00
elpolilla
b363f9085d
decompressor tool improved
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40216
2008-09-09 14:10:15 +00:00
anibal
98ac66f821
Added more flexibility to child classes that override gespider command
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40215
2008-09-09 14:10:00 +00:00
elpolilla
eddc7b7608
added sample data for decompression tool's test
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40214
2008-09-09 13:02:49 +00:00
elpolilla
7df63477bf
added response decompression tool
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40213
2008-09-09 12:56:01 +00:00
olveyra
3a2981801c
minor code fix
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40212
2008-09-08 12:06:53 +00:00
olveyra
3fe36e77d2
Removed PRIORITY constants, added DEFAULT_PRIORITY setting
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40211
2008-09-06 17:03:47 +00:00
olveyra
8b09be8601
adaptors with generic matching function
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40210
2008-09-05 21:51:53 +00:00
Andres Moreira
09b11a3a7b
Added a histogram plot of the simpages groups to the report. Added quantities of HTML elements to the symbols of the simhash. It is now more effective.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40209
2008-09-05 14:45:24 +00:00
olveyra
c2af3124ba
- added negative attribute name match
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40208
2008-09-04 19:50:25 +00:00
olveyra
4e61f05761
second security fix
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40207
2008-09-04 19:33:36 +00:00
olveyra
51bdd6944b
- removed contrib/adaptors.py
...
- display adaptor name when an exception is raised inside the adaptor
pipeline
- add debug parameter to item attribute method to display input/output
of each adaptor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40206
2008-09-04 19:11:53 +00:00
olveyra
daf51203e3
security fix
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40205
2008-09-04 18:15:45 +00:00
Andres Moreira
f4f9626c3f
Remove old code.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40204
2008-09-03 19:11:52 +00:00
Andres Moreira
3cb1ab8794
Add rule engine to the framework. Rules are executed in a pipeline.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40203
2008-09-03 19:06:34 +00:00
olveyra
1fa947dd67
- Improved attribute name checks
...
- added support for tuple definition of pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40202
2008-09-03 13:53:28 +00:00
olveyra
82a9fa9ffc
more efficient name attribute check in adaptors pipeline
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40201
2008-09-02 18:52:04 +00:00