mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 16:23:57 +00:00

64 Commits

Author SHA1 Message Date
Ismael Carnales
00d9ce6608 fixes to newitem doc 2009-07-21 12:22:49 -03:00
Ismael Carnales
894a5d81d3 changed the newitem API to a dict-like interface 2009-07-21 12:20:49 -03:00
Pablo Hoffman
9867e7b2d6 removed references to obsolete ENABLED_SPIDERS_FILE setting 2009-07-21 01:11:48 -03:00
Pablo Hoffman
92b746d866 SimpledbStatsCollector: moved domain creation to constructor 2009-07-17 12:49:53 -03:00
Pablo Hoffman
292757f312 added link to architecture overview and fixed old link 2009-07-16 19:15:19 -03:00
Pablo Hoffman
136014d718 Added section about relative xpaths to XPathSelectors doc 2009-07-16 17:29:29 -03:00
Pablo Hoffman
0fd603e27e Automated merge with http://hg.scrapy.org/users/ismael/scrapy-newitem/ 2009-07-16 11:16:46 -03:00
Pablo Hoffman
c2f12327d6 added dumping of global scrapy stats at engine shutdown 2009-07-16 10:48:10 -03:00
Ismael Carnales
a9069e8eaa reorganized experimental doc in topics and ref, split long documents, introduced minor changes 2009-07-16 09:58:44 -03:00
Pablo Hoffman
1125825996 doc: minor updates to tutorial 2009-07-15 22:10:00 -03:00
Pablo Hoffman
5582305648 renamed old STATS_DEBUG setting to STATS_DUMP 2009-07-15 21:02:51 -03:00
Pablo Hoffman
7f281cf295 some minor updates to doc 2009-07-15 01:40:44 -03:00
Pablo Hoffman
8480f49cf6 Complete stats refactoring and documentation
--HG--
rename : scrapy/stats/statscollector.py => scrapy/stats/collector/__init__.py
2009-07-15 00:30:40 -03:00
Ismael Carnales
0227c53b13 reorganize newitem documentation 2009-07-14 15:55:52 -03:00
Pablo Hoffman
7fb889349e fixed link to experimental doc 2009-07-14 08:46:15 -03:00
Pablo Hoffman
f1dc4e4af2 corrected TextField examples 2009-07-14 08:39:38 -03:00
Pablo Hoffman
a8aae41a13 removed unused files 2009-07-13 22:16:43 -03:00
Pablo Hoffman
5c39d173a5 merged proposed and experimental documentation, as it didn't make sense to keep two separate sections
--HG--
rename : docs/proposed/_images/scrapy_architecture.odg => docs/experimental/_images/scrapy_architecture.odg
rename : docs/proposed/_images/scrapy_architecture.png => docs/experimental/_images/scrapy_architecture.png
rename : docs/proposed/index.rst => docs/experimental/index.rst
rename : docs/proposed/newitem-fields.rst => docs/experimental/newitem-fields.rst
rename : docs/proposed/newitem.rst => docs/experimental/newitem.rst
2009-07-13 22:15:54 -03:00
Pablo Hoffman
26bb8ef608 doc: improved newitem fields reference 2009-07-13 22:05:48 -03:00
Pablo Hoffman
2bf39b7cdb minor layout cleanups to newitem doc 2009-07-13 22:05:18 -03:00
Pablo Hoffman
74fcfc2cfc deprecated old adaptors documentation 2009-07-13 22:03:56 -03:00
Ismael Carnales
47d937f36b only accept unicode strings in text fields 2009-07-13 15:54:48 -03:00
Ismael Carnales
d75afaa161 renamed StringField to TextField 2009-07-13 15:54:46 -03:00
Pablo Hoffman
dff510384b doc: updated SCHEDULER_MIDDLEWARES_BASE setting 2009-07-13 13:33:47 -03:00
Ismael Carnales
b44409a203 added TimeField to newitem 2009-07-13 10:31:32 -03:00
Pablo Hoffman
5054b67a02 improved newitems doc and marked robust scraped items as deprecated 2009-07-11 21:26:52 -03:00
Daniel Grana
d5d2c5c924 update documentation to recent pydispatcher import path change 2009-07-09 17:13:30 -03:00
Daniel Grana
18fbd7c7eb Automated merge with ssh://hg.scrapy.org/scrapy 2009-07-09 16:58:07 -03:00
Daniel Grana
eff8ea6173 remove response from item_passed and item_dropped signal api 2009-07-09 16:57:03 -03:00
Pablo Hoffman
5da32d9f6d fixed Sphinx warning 2009-07-09 16:50:13 -03:00
Pablo Hoffman
ae7333d598 added simplejson optional dependency to doc 2009-07-09 16:49:20 -03:00
Pablo Hoffman
60e7b80798 removed signal docs from core.signals module, to leave them only in one place (the doc) 2009-07-09 12:57:10 -03:00
Ismael Carnales
9e3e41f946 added more newitem documentation in proposed 2009-07-09 11:29:04 -03:00
Pablo Hoffman
b071681cd4 removed duplicated spiders doc (which used autodoc) 2009-07-09 11:14:33 -03:00
Pablo Hoffman
4f19115a80 removed old setting from default_settings.py, updated doc of CONCURRENT_ITEMS setting 2009-07-09 10:56:15 -03:00
Pablo Hoffman
8b26e49636 Added new ItemProcessor component to Scraper component 2009-07-08 23:48:06 -03:00
Pablo Hoffman
0c4c153819 improved Scrapy documentation index for better usability 2009-07-01 09:51:57 -03:00
Pablo Hoffman
80cd534f92 removed redundant botname from log lines 2009-06-25 16:48:04 -03:00
Pablo Hoffman
834ac9fca0 Some telnet console changes:
- added telnet console documentation
- added documentation for debugging memory leaks with guppy
- sorted out shell aliases
- set default port (TELNETCONSOLE_PORT) to 6023
2009-06-23 16:08:58 -03:00
Pablo Hoffman
a8fd107ad4 engine: some extra simplifications and removed debug mode 2009-06-22 23:01:41 -03:00
Pablo Hoffman
400a54bf7c Added reasons when closing domains ('reason' argument to engine close_domain method), replaced old 'status' parameter by new 'reason' parameter in domain_closed signal. updated and improved signals doc 2009-06-21 22:00:16 -03:00
Pablo Hoffman
adccd9a04e Sorted out Duplicate Filter API.
--HG--
rename : scrapy/dupefilter/__init__.py => scrapy/contrib/dupefilter.py
2009-06-20 20:29:07 -03:00
Daniel Grana
47970e91bc core: Invert request priority meaning, a higher request.priority value means more priority 2009-06-20 19:23:26 -03:00
Daniel Grana
18b6fecc47 Remove custom redirection priority of request returned by downloadermiddleware 2009-06-20 19:19:07 -03:00
Pablo Hoffman
161335fe78 Added domain schedulers (whose functionality was previously mixed with the
Scrapy Scheduler) and removed domain prioritizers whose functionality became
duplicated by the new domain schedulers.
2009-06-19 17:55:54 -03:00
Pablo Hoffman
3cb12e5dd3 Moved init_domain functionality out of the engine (refs #88) and into the
spider level. A new spider (InitSpider, not yet documented) was added to
provide initialization facilities.

Also renamed make_request_from_url to make_requests_from_url and allowed it to
return iterables.
2009-06-18 14:43:56 -03:00
Pablo Hoffman
fd0e490157 added StatsMailer extension 2009-06-12 15:38:21 -03:00
Pablo Hoffman
4a1a01354b Added 'priority' attribute to Requests and removed old 'priority' argument passed through engine, scheduler and scheduler middleware calls 2009-06-11 22:25:47 -03:00
Pablo Hoffman
635ac1ca64 Simplified domain prioritizers, so that they don't receive domains in the
constructor (domain prioritizers will be refactored later anyway) and
simplified Scrapy Manager code thanks to this.

Added make_request_from_url method to BaseSpider, splitting functionality to
create requests from URLs which was previously done all in start_requests.
2009-06-10 14:21:36 -03:00
Pablo Hoffman
1aac694343 updated settings doc 2009-05-28 13:52:56 -03:00