mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 17:44:02 +00:00

1394 Commits

Author SHA1 Message Date
Daniel Grana
3901e06f44 Automated merge with ssh://hg.scrapy.org/scrapy 2009-08-07 21:24:41 -03:00
Daniel Grana
9aaf010af3 add stats of memory usage 2009-08-07 21:24:36 -03:00
Pablo Hoffman
14148b6ed8 fixed unittest codes broken in previous commit 2009-08-07 14:45:28 -03:00
Pablo Hoffman
db90e26a8b renamed ItemLoader class to Loader 2009-08-07 14:39:30 -03:00
Pablo Hoffman
e585c6cac4 relocated experimental newitems/loaders doc, and added example for extending fields metadata
--HG--
rename : docs/experimental/newitem-loader.rst => docs/experimental/loaders.rst
rename : docs/experimental/newitem.rst => docs/experimental/newitems.rst
2009-08-07 14:28:58 -03:00
Pablo Hoffman
d95e99f585 Added documentation for Items and Loaders, removed obsolete Item Adaptors documentation
--HG--
rename : docs/experimental/topics/newitem/index.rst => docs/experimental/newitem.rst
2009-08-07 03:50:09 -03:00
Pablo Hoffman
efa08318be renamed JoinStrings reducer to Join, accept item as first positional argument in ItemLoader constructor, removed expanders and reducers docstrings (will be moved to documentation) 2009-08-07 03:48:42 -03:00
Pablo Hoffman
3658acd9da newitem: reverting to use 'default' Field key instead of 'default_factory' 2009-08-06 21:29:40 -03:00
Pablo Hoffman
78b69ec97e merge 2009-08-06 14:35:52 -03:00
Pablo Hoffman
d142489b81 remove_entities: added test for encoding argument 2009-08-06 14:35:24 -03:00
Pablo Hoffman
f2c2609e42 remove_entities: added support for common browser hack for numeric character references in the 80-9F range 2009-08-06 14:31:59 -03:00
Pablo Hoffman
c3f7bfea1f remove_entities: added encoding argument, and removed some empty lines 2009-08-06 14:26:38 -03:00
Daniel Grana
82dc2096fe Automated merge with ssh://hg.scrapy.org/scrapy 2009-08-06 12:28:49 -03:00
Daniel Grana
6ef991b71f normalize times used for stats to UTC 2009-08-06 12:07:22 -03:00
Pablo Hoffman
80db8abee2 use time.time() instead of datetime in SpiderProfiler extensions, which is faster and simpler 2009-08-06 11:56:32 -03:00
Pablo Hoffman
de48406f10 added 3 common content-types (for feeds) to ResponseTypes class 2009-08-06 11:37:20 -03:00
Pablo Hoffman
a23ff37050 ItemLoader: added one more test and improved other test names 2009-08-05 11:38:01 -03:00
Pablo Hoffman
7bc7af0162 ItemLoader: some more code cleanups, and added many more tests 2009-08-05 00:41:02 -03:00
Pablo Hoffman
9081e84e27 ItemLoader: sorted out module locations, and added more tests
--HG--
rename : scrapy/newitem/reducers.py => scrapy/newitem/loader/reducers.py
2009-08-04 20:02:49 -03:00
Pablo Hoffman
ac9f4c9cc2 Refactored scrapy.newitem package:
- left only one type of Field - just a dict wrapper to contain field metadata
- removed Item Builder and tests
- adapted Item Loader to work with new Field class
2009-08-04 19:26:31 -03:00
Pablo Hoffman
114dba2850 some minor simplifications to tree_expander() function 2009-08-04 09:10:21 -03:00
Pablo Hoffman
8d705ec302 added ItemLoader class, an alternative implementation of ItemBuilder with a slightly different API 2009-08-03 22:53:08 -03:00
Ismael Carnales
f05695d75e added first implementation of ItemBuilder 2009-08-03 17:27:57 -03:00
Ismael Carnales
32894643a0 added ListField documentation, ordered field reference alphabetically 2009-08-03 15:00:04 -03:00
Ismael Carnales
cf638e682c made ListField init with an instance (not a class) of Field 2009-08-03 15:00:02 -03:00
Ismael Carnales
d8b85ae7ad moved MultiValuedField to ListField 2009-08-02 18:43:35 -03:00
Pablo Hoffman
fcb33c7988 removed obsolete pipeline 2009-07-31 17:24:30 -03:00
Pablo Hoffman
7c049d2ef5 use standard 'mcs' for first argument of meta class __new__ method 2009-07-31 16:51:29 -03:00
Pablo Hoffman
02c454c26e newitem: added warning when trying to access item field value via getattr instead of getitem 2009-07-31 16:49:27 -03:00
Pablo Hoffman
c3427e075c added domain_stats parameter to stats_domain_closed signal 2009-07-31 16:36:35 -03:00
Pablo Hoffman
73172b244d added from_unicode_list() method to Field objects 2009-07-30 16:58:24 -03:00
Pablo Hoffman
c0dcd76424 moved unused scrapy.core.scheduler.store module to scrapy.contrib_exp.history
--HG--
rename : scrapy/core/scheduler/store.py => scrapy/contrib_exp/history/memorystore.py
rename : scrapy/contrib_exp/history/store.py => scrapy/contrib_exp/history/sqlstore.py
2009-07-30 13:51:43 -03:00
Daniel Grana
8ecd16b5e3 remove obsolete monkey patch for twisted 2.5 2009-07-29 21:02:15 -03:00
Pablo Hoffman
b336de7302 fixed minor bugs with spiderctl webconsole extension 2009-07-29 19:50:38 -03:00
Pablo Hoffman
4a1fc74d69 fixed spiderstats webconsole extension 2009-07-29 19:26:12 -03:00
Pablo Hoffman
01b79e386f fixed StatsCollector webconsole extension 2009-07-29 19:15:19 -03:00
Pablo Hoffman
a86b12a879 moved patches.py to xlib/patches.py to avoid import errors
--HG--
rename : scrapy/patches.py => scrapy/xlib/patches.py
2009-07-29 19:01:36 -03:00
Pablo Hoffman
8ec7c9e01c WEBCONSOLE_PORT setting now defaults to 6080 2009-07-29 18:59:34 -03:00
Pablo Hoffman
fde4f219a8 don't use packages when modules are enough
--HG--
rename : scrapy/extension/__init__.py => scrapy/extension.py
rename : scrapy/fetcher/__init__.py => scrapy/fetcher.py
rename : scrapy/log/__init__.py => scrapy/log.py
rename : scrapy/mail/__init__.py => scrapy/mail.py
rename : scrapy/patches/monkeypatches.py => scrapy/patches.py
2009-07-29 18:41:15 -03:00
Pablo Hoffman
230bcef7b6 some refactor of settings manager (without changing API): don't fail if no settings module is found (fail on scrapy.cmdline instead). also, some code improvements for clarity. 2009-07-29 18:26:29 -03:00
Pablo Hoffman
d57c0100db another minor fix to exporters 2009-07-29 08:23:38 -03:00
Pablo Hoffman
c3d732f28a another bug fix in pprint item exporter 2009-07-28 23:00:35 -03:00
Pablo Hoffman
643f93721b removed wrong self from base constructor calls 2009-07-28 20:53:03 -03:00
Pablo Hoffman
bb7b6815c0 added Item Exporters 2009-07-28 20:20:44 -03:00
Pablo Hoffman
75cf903e24 adapted project template to use the new Link Extractors location 2009-07-28 12:27:25 -03:00
Pablo Hoffman
64d9155572 CloseDomain extension: fixed bug on domain close when not using CLOSEDOMAIN_TIMEOUT 2009-07-28 12:23:13 -03:00
Pablo Hoffman
fcc91901eb finished cleaning up closedomain documentation, and updated default settings 2009-07-27 15:42:35 -03:00
Pablo Hoffman
09ba6927d7 Some changes to CloseDomain extension:
- added support for closing by item passed count (CLOSEDOMAIN_ITEMPASSED)
- removed support for sending notification emails (since that's the job of
  another extension)
2009-07-27 15:23:50 -03:00
Pablo Hoffman
9da66698f3 moved httprepr() method (from Request and Response objects) to scrapy.utils functions 2009-07-25 18:56:12 -03:00
Daniel Grana
bed3c38014 fix delayedclosedomain extension bug due to changing lastseen from datetime to time.time 2009-07-25 15:22:15 -03:00