Pablo Hoffman
d7a88e8d77
fixed hack for IE6
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40238
2008-09-15 12:58:19 +00:00
Pablo Hoffman
31d3b2cf5a
added hack for IE6
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40237
2008-09-15 12:56:31 +00:00
Daniel Grana
4944762341
move bugtraps to avoid silencing bugs in media_downloaded or media_failed
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40236
2008-09-15 12:46:34 +00:00
Daniel Grana
ca695c8c79
small bugfix and add bugtraps
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40235
2008-09-15 08:57:51 +00:00
Daniel Grana
6407a744e3
remove item.image_urls from get_urls_from_item
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40234
2008-09-15 08:27:15 +00:00
Daniel Grana
af0808be95
add info object to public methods
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40233
2008-09-15 07:08:27 +00:00
Daniel Grana
4263c3a2e1
Adds docstrings to public methods, remove unused imports, normalize urls to requests.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40232
2008-09-15 06:01:40 +00:00
Daniel Grana
31e497e5eb
add media_to_download hook support, and return cached result if available
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40231
2008-09-15 04:33:40 +00:00
olveyra
df2f4b21b8
removed reference to guid (decobot specific) attribute in item pipeline logs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40230
2008-09-15 01:23:10 +00:00
Daniel Grana
3803f67645
mediamiddleware: bugfixes and remove prints
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40229
2008-09-12 21:55:14 +00:00
olveyra
2fd5cf644e
- Support for no output in cluster status
- schedule option by default does not have output
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40228
2008-09-12 20:48:48 +00:00
Daniel Grana
d612701b66
add media pipeline skeleton
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40227
2008-09-12 19:58:45 +00:00
Andres Moreira
2dc20bb56e
Fixed condition of new option --quiet.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40226
2008-09-11 16:35:50 +00:00
olveyra
4cffc039ff
- leave in adaptors.py only the elemental adaptors pipeline code.
- moved to contrib/adaptorpipeline the advanced adaptors pipeline code.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40225
2008-09-11 15:33:35 +00:00
Andres Moreira
23814537b3
Fixed bug in the total items count. Added new option --quiet to the action diff. This option doesn't show the report if there aren't differences.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40224
2008-09-11 14:59:16 +00:00
elpolilla
2b6bcd61df
Small piece of code moved for not complying with conventions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40223
2008-09-11 14:50:15 +00:00
elpolilla
06964cf36b
some small semantic errors fixed in xpath test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40222
2008-09-11 12:13:33 +00:00
elpolilla
d8b7f41ee5
implemented extract_unquote method over xpath selectors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40221
2008-09-11 11:14:44 +00:00
olveyra
da7b87962a
- AdaptorPipe compilation feature to resolve the efficiency problem
and maintain the simplicity of the current pipeline definition approach
(people who prefer to deal with many lines to define and edit a
pipeline can edit the compiled version directly)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40220
2008-09-10 14:06:28 +00:00
elpolilla
789d8d3531
bugfix in decompressor tool
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40219
2008-09-10 02:49:05 +00:00
olveyra
ec710702ce
Added getd command, similar to get but decompresses the response first.
Based on decompressor tool.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40218
2008-09-09 15:05:42 +00:00
elpolilla
543b9de973
pylinted decompressor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40217
2008-09-09 14:12:57 +00:00
elpolilla
b363f9085d
decompressor tool improved
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40216
2008-09-09 14:10:15 +00:00
anibal
98ac66f821
Added more flexibility for child classes that override gespider command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40215
2008-09-09 14:10:00 +00:00
elpolilla
eddc7b7608
added sample data for decompression tool's test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40214
2008-09-09 13:02:49 +00:00
elpolilla
7df63477bf
added response decompression tool
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40213
2008-09-09 12:56:01 +00:00
olveyra
3a2981801c
minor code fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40212
2008-09-08 12:06:53 +00:00
olveyra
3fe36e77d2
Removed PRIORITY constants, added DEFAULT_PRIORITY setting
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40211
2008-09-06 17:03:47 +00:00
olveyra
8b09be8601
adaptors with generic matching function
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40210
2008-09-05 21:51:53 +00:00
Andres Moreira
09b11a3a7b
Added a histogram plot of the simpages groups to the report. Added quantities of HTML elements to the simhash symbols, which makes it more effective.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40209
2008-09-05 14:45:24 +00:00
olveyra
c2af3124ba
- added negative attribute name match
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40208
2008-09-04 19:50:25 +00:00
olveyra
4e61f05761
second security fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40207
2008-09-04 19:33:36 +00:00
olveyra
51bdd6944b
- removed contrib/adaptors.py
- display adaptor name when an exception is raised inside the adaptor
pipeline
- add debug parameter to item attribute method to display input/output
of each adaptor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40206
2008-09-04 19:11:53 +00:00
olveyra
daf51203e3
security fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40205
2008-09-04 18:15:45 +00:00
Andres Moreira
f4f9626c3f
Remove old code.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40204
2008-09-03 19:11:52 +00:00
Andres Moreira
3cb1ab8794
Add rule engine to the framework. Rules are executed in a pipeline.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40203
2008-09-03 19:06:34 +00:00
olveyra
1fa947dd67
- Improved attribute name checks
- added support for tuple definition of pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40202
2008-09-03 13:53:28 +00:00
olveyra
82a9fa9ffc
more efficient name attribute check in adaptors pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40201
2008-09-02 18:52:04 +00:00
olveyra
6b16ee5e67
- ensure deferred_degenerate will take an iterable (bug raised when no
spider middleware is enabled)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40200
2008-09-02 17:39:22 +00:00
olveyra
f54ba9f7e9
removed unused import
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40199
2008-09-02 15:41:22 +00:00
olveyra
972896cd87
removed canonicalize from get function in shell
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40198
2008-09-02 15:20:16 +00:00
olveyra
2a30073ece
fix get function (strip and canonicalize url)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40197
2008-09-02 14:12:17 +00:00
olveyra
7913a86250
updated settings template
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40196
2008-09-02 12:38:59 +00:00
olveyra
472a0de139
- Fixes in adaptors code, after testing
- added attrs_list param to insertadaptor method
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40195
2008-09-01 20:06:10 +00:00
Pablo Hoffman
41fa98801c
removed unneeded exception code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40194
2008-09-01 19:28:30 +00:00
Pablo Hoffman
a88fb416c7
changed Referer middleware class name
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40193
2008-09-01 04:28:18 +00:00
Pablo Hoffman
037e6c2125
improved SpiderMiddleware's docstrings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40192
2008-09-01 04:18:12 +00:00
Pablo Hoffman
c9cafd5c43
added UrlFilterMiddleware
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40191
2008-09-01 04:16:51 +00:00
Pablo Hoffman
d7d94482a9
added update_fingerprint method to Request
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40190
2008-09-01 04:09:51 +00:00
Pablo Hoffman
96d24c7640
fixed some documentation errors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40189
2008-09-01 03:34:53 +00:00