mirror of https://github.com/scrapy/scrapy.git synced 2025-02-23 14:24:19 +00:00

280 Commits

Author SHA1 Message Date
olveyra
df2f4b21b8 removed reference to guid (decobot specific) attribute in item
pipeline logs

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40230
2008-09-15 01:23:10 +00:00
Daniel Grana
3803f67645 mediamiddleware: bugfixes and remove prints
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40229
2008-09-12 21:55:14 +00:00
olveyra
2fd5cf644e - Support for not output in cluster status
- schedule option by default does not have output

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40228
2008-09-12 20:48:48 +00:00
Daniel Grana
d612701b66 add media pipeline skeleton
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40227
2008-09-12 19:58:45 +00:00
Andres Moreira
2dc20bb56e Fixed condition of new option --quiet.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40226
2008-09-11 16:35:50 +00:00
olveyra
4cffc039ff - leave in adaptors.py only the elemental adaptors pipeline code.
- moved to contrib/adaptorpipeline the advanced adaptors pipeline code.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40225
2008-09-11 15:33:35 +00:00
Andres Moreira
23814537b3 Fixed bug in the total items count. Added new option --quiet to the action diff. This option doesn't show the report if there aren't differences.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40224
2008-09-11 14:59:16 +00:00
elpolilla
2b6bcd61df Small piece of code moved for not complying with conventions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40223
2008-09-11 14:50:15 +00:00
elpolilla
06964cf36b some small semantics errors fixed in xpath test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40222
2008-09-11 12:13:33 +00:00
elpolilla
d8b7f41ee5 implemented extract_unquote method over xpath selectors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40221
2008-09-11 11:14:44 +00:00
olveyra
da7b87962a - AdaptorPipe compilation feature to resolve the problem of efficience
and mantain the simplicity of the actual pipeline definition approach (for
people that prefer to bore with lots of lines to define and edit a
pipeline, they can edit directly the compiled version)

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40220
2008-09-10 14:06:28 +00:00
elpolilla
789d8d3531 bugfix in decompressor tool
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40219
2008-09-10 02:49:05 +00:00
olveyra
ec710702ce Added getd command, similar to get but first decompress the response.
Based on decompressor tool.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40218
2008-09-09 15:05:42 +00:00
elpolilla
543b9de973 pylinted decompressor
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40217
2008-09-09 14:12:57 +00:00
elpolilla
b363f9085d decompressor tool improved
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40216
2008-09-09 14:10:15 +00:00
anibal
98ac66f821 Added more flexibility to childclasses who override gespider command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40215
2008-09-09 14:10:00 +00:00
elpolilla
eddc7b7608 added sample data for decompression tool's test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40214
2008-09-09 13:02:49 +00:00
elpolilla
7df63477bf added response decompression tool
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40213
2008-09-09 12:56:01 +00:00
olveyra
3a2981801c minor code fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40212
2008-09-08 12:06:53 +00:00
olveyra
3fe36e77d2 Removed PRIORITY constants, added DEFAULT_PRIORITY setting
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40211
2008-09-06 17:03:47 +00:00
olveyra
8b09be8601 adaptors with generic matching function
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40210
2008-09-05 21:51:53 +00:00
Andres Moreira
09b11a3a7b Added an histogram plot to simpages group to the report. Added quantities of html elements to symbol of the simhash. Now is more effective.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40209
2008-09-05 14:45:24 +00:00
olveyra
c2af3124ba - added negative attribute name match
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40208
2008-09-04 19:50:25 +00:00
olveyra
4e61f05761 second security fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40207
2008-09-04 19:33:36 +00:00
olveyra
51bdd6944b - removed contrib/adaptors.py
- display adaptor name when an exception raises inside the adaptor
pipeline
- add debug parameter to item attribute method to display input/output
of each adaptor

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40206
2008-09-04 19:11:53 +00:00
olveyra
daf51203e3 security fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40205
2008-09-04 18:15:45 +00:00
Andres Moreira
f4f9626c3f Remove old code.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40204
2008-09-03 19:11:52 +00:00
Andres Moreira
3cb1ab8794 Add rule engine to the framework. Rules are executed in a pipeline.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40203
2008-09-03 19:06:34 +00:00
olveyra
1fa947dd67 - Improved attribute name checks
- added support to tuple definition of pipeline

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40202
2008-09-03 13:53:28 +00:00
olveyra
82a9fa9ffc more efficient name attribute check in adaptors pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40201
2008-09-02 18:52:04 +00:00
olveyra
6b16ee5e67 - assure deferred_degenerate will take an iterable (bug raised when no
spider middleware is enabled)

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40200
2008-09-02 17:39:22 +00:00
olveyra
f54ba9f7e9 removed unused import
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40199
2008-09-02 15:41:22 +00:00
olveyra
972896cd87 removed canonicalize from get function in shell
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40198
2008-09-02 15:20:16 +00:00
olveyra
2a30073ece fix get function (strip and canonicalize url)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40197
2008-09-02 14:12:17 +00:00
olveyra
7913a86250 updated settings template
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40196
2008-09-02 12:38:59 +00:00
olveyra
472a0de139 - Fixes in adaptors code, after testing
- added attrs_list param to insertadaptor method

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40195
2008-09-01 20:06:10 +00:00
Pablo Hoffman
41fa98801c removed unneeded exception code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40194
2008-09-01 19:28:30 +00:00
Pablo Hoffman
a88fb416c7 changed Referer middleware class name
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40193
2008-09-01 04:28:18 +00:00
Pablo Hoffman
037e6c2125 improved SpiderMiddleware's docstrings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40192
2008-09-01 04:18:12 +00:00
Pablo Hoffman
c9cafd5c43 added UrlFilterMiddleware
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40191
2008-09-01 04:16:51 +00:00
Pablo Hoffman
d7d94482a9 added update_fingerprint method to Request
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40190
2008-09-01 04:09:51 +00:00
Pablo Hoffman
96d24c7640 fixed some documentation errors
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40189
2008-09-01 03:34:53 +00:00
Pablo Hoffman
29c3715c5a changed remove_fragments argument to keep_fragments, for consistency with the other canonicalize_url arguments
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40188
2008-09-01 03:33:18 +00:00
Pablo Hoffman
30803b9e89 added canonicalize_url function to scrapy.utils.url, along with a complete suite of tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40187
2008-09-01 03:31:11 +00:00
Pablo Hoffman
34355048c3 some functions were added to scrapy.utils.url without following our policies for adding tests (to scrapy.tests) and documentation (as docstrings). fixed that.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40186
2008-09-01 01:19:32 +00:00
olveyra
f3013bb9ad Improved Adaptors code
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40185
2008-08-31 00:25:13 +00:00
olveyra
0e6562cb47 moved some url utils from decobot to scrapy
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40184
2008-08-27 17:37:32 +00:00
olveyra
9164150bed - avoid to raise an exception when no arg is given to replay command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40183
2008-08-27 17:21:12 +00:00
olveyra
b11b84fff1 - moved scrape command to shell
- fixes
- get and scrapehelp functions added as ipython magic commands

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40182
2008-08-27 13:52:45 +00:00
Pablo Hoffman
bfe6168f3b cleaned up simpages code a bit, added some documentation
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40181
2008-08-25 00:00:14 +00:00