mirror of https://github.com/scrapy/scrapy.git synced 2025-02-21 05:33:16 +00:00

7850 Commits

Author SHA1 Message Date
Daniel Grana
93c313853d use request.deferred for cached valued too
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40250
2008-09-17 12:34:48 +00:00
Daniel Grana
6f2d36199b media: use deferred of request to allow adding callback from get_media_requests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40249
2008-09-16 23:40:48 +00:00
Daniel Grana
7932a8a5b6 add image pipeline that stores images and can generate thumbnails in different sizes
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40248
2008-09-16 23:40:06 +00:00
Daniel Grana
4807f2eb82 move media_to_download hook to allow caching and prevent downloading
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40247
2008-09-16 23:39:26 +00:00
Daniel Grana
5048491f8d mediapipeline: change cache variable by domaininfo
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40246
2008-09-16 19:31:25 +00:00
Daniel Grana
41a4f22c27 messy typo adding errback
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40245
2008-09-16 14:03:39 +00:00
anibal
e0d90f9d87 Scrapy architecture diagram 1st version, dia source
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40244
2008-09-16 11:56:15 +00:00
Pablo Hoffman
0a7a5e5b3d added docs dir, guess for what? :)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40243
2008-09-16 11:30:22 +00:00
Pablo Hoffman
b57148dfad renamed scrapy.lib to scrapy.xlib
--HG--
rename : scrapy/trunk/scrapy/lib/BeautifulSoup.py => scrapy/trunk/scrapy/xlib/BeautifulSoup.py
rename : scrapy/trunk/scrapy/lib/__init__.py => scrapy/trunk/scrapy/xlib/__init__.py
rename : scrapy/trunk/scrapy/lib/lrucache.py => scrapy/trunk/scrapy/xlib/lrucache.py
rename : scrapy/trunk/scrapy/lib/lsprofcalltree.py => scrapy/trunk/scrapy/xlib/lsprofcalltree.py
rename : scrapy/trunk/scrapy/lib/pydispatch/__init__.py => scrapy/trunk/scrapy/xlib/pydispatch/__init__.py
rename : scrapy/trunk/scrapy/lib/pydispatch/dispatcher.py => scrapy/trunk/scrapy/xlib/pydispatch/dispatcher.py
rename : scrapy/trunk/scrapy/lib/pydispatch/errors.py => scrapy/trunk/scrapy/xlib/pydispatch/errors.py
rename : scrapy/trunk/scrapy/lib/pydispatch/license.txt => scrapy/trunk/scrapy/xlib/pydispatch/license.txt
rename : scrapy/trunk/scrapy/lib/pydispatch/robust.py => scrapy/trunk/scrapy/xlib/pydispatch/robust.py
rename : scrapy/trunk/scrapy/lib/pydispatch/robustapply.py => scrapy/trunk/scrapy/xlib/pydispatch/robustapply.py
rename : scrapy/trunk/scrapy/lib/pydispatch/saferef.py => scrapy/trunk/scrapy/xlib/pydispatch/saferef.py
rename : scrapy/trunk/scrapy/lib/spidermonkey/INSTALL.scrapy => scrapy/trunk/scrapy/xlib/spidermonkey/INSTALL.scrapy
rename : scrapy/trunk/scrapy/lib/spidermonkey/__init__.py => scrapy/trunk/scrapy/xlib/spidermonkey/__init__.py
rename : scrapy/trunk/scrapy/lib/spidermonkey/sm_settings.py => scrapy/trunk/scrapy/xlib/spidermonkey/sm_settings.py
rename : scrapy/trunk/scrapy/lib/spidermonkey/spidermonkey.py => scrapy/trunk/scrapy/xlib/spidermonkey/spidermonkey.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40242
2008-09-15 15:37:44 +00:00
Pablo Hoffman
2cb300b926 removed simplejson from scrapy code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40241
2008-09-15 15:29:12 +00:00
Daniel Grana
08feb4df1c rename methods and enforce extraction of Request objects as item media to download
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40240
2008-09-15 13:55:23 +00:00
Pablo Hoffman
0090fcb0f5 improved fix to support IE7 (thanks bill!)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40239
2008-09-15 13:08:44 +00:00
Pablo Hoffman
d7a88e8d77 fixed hack for IE6
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40238
2008-09-15 12:58:19 +00:00
Pablo Hoffman
31d3b2cf5a added hack for IE6
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40237
2008-09-15 12:56:31 +00:00
Daniel Grana
4944762341 move bugtraps to avoid silence of bugs in media_downloaded or media_failed
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40236
2008-09-15 12:46:34 +00:00
Daniel Grana
ca695c8c79 small bugfix and add bugtraps
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40235
2008-09-15 08:57:51 +00:00
Daniel Grana
6407a744e3 remove item.image_urls from get_urls_from_item
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40234
2008-09-15 08:27:15 +00:00
Daniel Grana
af0808be95 add info object to public methods
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40233
2008-09-15 07:08:27 +00:00
Daniel Grana
4263c3a2e1 Adds docstrings to public methods, remove unused imports, normalize urls to requests.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40232
2008-09-15 06:01:40 +00:00
Daniel Grana
31e497e5eb add media_to_download hook support, and return cached result if available
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40231
2008-09-15 04:33:40 +00:00
olveyra
df2f4b21b8 removed reference to guid (decobot specific) attribute in item
pipeline logs

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40230
2008-09-15 01:23:10 +00:00
Daniel Grana
3803f67645 mediamiddleware: bugfixes and remove prints
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40229
2008-09-12 21:55:14 +00:00
olveyra
2fd5cf644e - Support for not output in cluster status
- schedule option by default does not have output

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40228
2008-09-12 20:48:48 +00:00
Daniel Grana
d612701b66 add media pipeline skeleton
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40227
2008-09-12 19:58:45 +00:00
Andres Moreira
2dc20bb56e Fixed condition of new option --quiet.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40226
2008-09-11 16:35:50 +00:00
olveyra
4cffc039ff - leave in adaptors.py only the elemental adaptors pipeline code.
- moved to contrib/adaptorpipeline the advanced adaptors pipeline code.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40225
2008-09-11 15:33:35 +00:00
Andres Moreira
23814537b3 Fixed bug in the total items count. Added new option --quiet to the action diff. This option doesn't show the report if there aren't differences.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40224
2008-09-11 14:59:16 +00:00
elpolilla
2b6bcd61df Small piece of code moved for not complying with conventions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40223
2008-09-11 14:50:15 +00:00
elpolilla
06964cf36b some small semantics errors fixed in xpath test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40222
2008-09-11 12:13:33 +00:00
elpolilla
d8b7f41ee5 implemented extract_unquote method over xpath selectors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40221
2008-09-11 11:14:44 +00:00
olveyra
da7b87962a - AdaptorPipe compilation feature to resolve the efficiency problem
and maintain the simplicity of the current pipeline definition approach
(people who prefer to bother with lots of lines to define and edit a
pipeline can edit the compiled version directly)

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40220
2008-09-10 14:06:28 +00:00
elpolilla
789d8d3531 bugfix in decompressor tool
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40219
2008-09-10 02:49:05 +00:00
olveyra
ec710702ce Added getd command, similar to get but first decompress the response.
Based on decompressor tool.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40218
2008-09-09 15:05:42 +00:00
elpolilla
543b9de973 pylinted decompressor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40217
2008-09-09 14:12:57 +00:00
elpolilla
b363f9085d decompressor tool improved
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40216
2008-09-09 14:10:15 +00:00
anibal
98ac66f821 Added more flexibility to child classes that override gespider command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40215
2008-09-09 14:10:00 +00:00
elpolilla
eddc7b7608 added sample data for decompression tool's test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40214
2008-09-09 13:02:49 +00:00
elpolilla
7df63477bf added response decompression tool
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40213
2008-09-09 12:56:01 +00:00
olveyra
3a2981801c minor code fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40212
2008-09-08 12:06:53 +00:00
olveyra
3fe36e77d2 Removed PRIORITY constants, added DEFAULT_PRIORITY setting
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40211
2008-09-06 17:03:47 +00:00
olveyra
8b09be8601 adaptors with generic matching function
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40210
2008-09-05 21:51:53 +00:00
Andres Moreira
09b11a3a7b Added a histogram plot of the simpages group to the report. Added quantities of html elements to symbol of the simhash. Now it is more effective.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40209
2008-09-05 14:45:24 +00:00
olveyra
c2af3124ba - added negative attribute name match
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40208
2008-09-04 19:50:25 +00:00
olveyra
4e61f05761 second security fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40207
2008-09-04 19:33:36 +00:00
olveyra
51bdd6944b - removed contrib/adaptors.py
- display adaptor name when an exception raises inside the adaptor
pipeline
- add debug parameter to item attribute method to display input/output
of each adaptor

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40206
2008-09-04 19:11:53 +00:00
olveyra
daf51203e3 security fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40205
2008-09-04 18:15:45 +00:00
Andres Moreira
f4f9626c3f Remove old code.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40204
2008-09-03 19:11:52 +00:00
Andres Moreira
3cb1ab8794 Add rule engine to the framework. Rules are executed in a pipeline.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40203
2008-09-03 19:06:34 +00:00
olveyra
1fa947dd67 - Improved attribute name checks
- added support to tuple definition of pipeline

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40202
2008-09-03 13:53:28 +00:00
olveyra
82a9fa9ffc more efficient name attribute check in adaptors pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40201
2008-09-02 18:52:04 +00:00