mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 17:23:37 +00:00

218 Commits

Author SHA1 Message Date
Ismael Carnales
202894dd8f fixes to newitem doc 2009-07-22 10:25:22 -03:00
Pablo Hoffman
7d8ba0542c fixed typo in doc 2009-07-21 17:38:46 -03:00
Pablo Hoffman
aa345e116a Added spider middleware documentation 2009-07-21 17:19:19 -03:00
Ismael Carnales
90f1d9e489 Added Scheduler middleware reference documentation 2009-07-21 16:52:27 -03:00
Ismael Carnales
00d9ce6608 fixes to newitem doc 2009-07-21 12:22:49 -03:00
Ismael Carnales
894a5d81d3 changed the newitem API to a dict-like interface 2009-07-21 12:20:49 -03:00
Pablo Hoffman
9867e7b2d6 removed references to obsolete ENABLED_SPIDERS_FILE setting 2009-07-21 01:11:48 -03:00
Pablo Hoffman
92b746d866 SimpledbStatsCollector: moved domain creation to constructor 2009-07-17 12:49:53 -03:00
Pablo Hoffman
292757f312 added link to architecture overview and fixed old link 2009-07-16 19:15:19 -03:00
Pablo Hoffman
136014d718 Added section about relative xpaths to XPathSelectors doc 2009-07-16 17:29:29 -03:00
Pablo Hoffman
0fd603e27e Automated merge with http://hg.scrapy.org/users/ismael/scrapy-newitem/ 2009-07-16 11:16:46 -03:00
Pablo Hoffman
c2f12327d6 added dumping of global scrapy stats at engine shutdown 2009-07-16 10:48:10 -03:00
Ismael Carnales
a9069e8eaa reorganized experimental doc in topics and ref, split long documents, introduced minor changes 2009-07-16 09:58:44 -03:00
Pablo Hoffman
1125825996 doc: minor updates to tutorial 2009-07-15 22:10:00 -03:00
Pablo Hoffman
5582305648 renamed old STATS_DEBUG setting to STATS_DUMP 2009-07-15 21:02:51 -03:00
Pablo Hoffman
7f281cf295 some minor updates to doc 2009-07-15 01:40:44 -03:00
Pablo Hoffman
8480f49cf6 Complete stats refactoring and documentation
--HG--
rename : scrapy/stats/statscollector.py => scrapy/stats/collector/__init__.py
2009-07-15 00:30:40 -03:00
Ismael Carnales
0227c53b13 reorganize newitem documentation 2009-07-14 15:55:52 -03:00
Pablo Hoffman
7fb889349e fixed link to experimental doc 2009-07-14 08:46:15 -03:00
Pablo Hoffman
f1dc4e4af2 corrected TextField examples 2009-07-14 08:39:38 -03:00
Pablo Hoffman
a8aae41a13 removed unused files 2009-07-13 22:16:43 -03:00
Pablo Hoffman
5c39d173a5 merged proposed and experimental documentation, as it didn't make sense to keep two separate sections
--HG--
rename : docs/proposed/_images/scrapy_architecture.odg => docs/experimental/_images/scrapy_architecture.odg
rename : docs/proposed/_images/scrapy_architecture.png => docs/experimental/_images/scrapy_architecture.png
rename : docs/proposed/index.rst => docs/experimental/index.rst
rename : docs/proposed/newitem-fields.rst => docs/experimental/newitem-fields.rst
rename : docs/proposed/newitem.rst => docs/experimental/newitem.rst
2009-07-13 22:15:54 -03:00
Pablo Hoffman
26bb8ef608 doc: improved newitem fields reference 2009-07-13 22:05:48 -03:00
Pablo Hoffman
2bf39b7cdb minor layout cleanups to newitem doc 2009-07-13 22:05:18 -03:00
Pablo Hoffman
74fcfc2cfc deprecated old adaptors documentation 2009-07-13 22:03:56 -03:00
Ismael Carnales
47d937f36b only accept unicode strings in text fields 2009-07-13 15:54:48 -03:00
Ismael Carnales
d75afaa161 renamed StringField to TextField 2009-07-13 15:54:46 -03:00
Pablo Hoffman
dff510384b doc: updated SCHEDULER_MIDDLEWARES_BASE setting 2009-07-13 13:33:47 -03:00
Ismael Carnales
b44409a203 added TimeField to newitem 2009-07-13 10:31:32 -03:00
Pablo Hoffman
5054b67a02 improved newitems doc and marked robust scraped items as deprecated 2009-07-11 21:26:52 -03:00
Daniel Grana
d5d2c5c924 update documentation to recent pydispatcher import path change 2009-07-09 17:13:30 -03:00
Daniel Grana
18fbd7c7eb Automated merge with ssh://hg.scrapy.org/scrapy 2009-07-09 16:58:07 -03:00
Daniel Grana
eff8ea6173 remove response from item_passed and item_dropped signal api 2009-07-09 16:57:03 -03:00
Pablo Hoffman
5da32d9f6d fixed Sphinx warning 2009-07-09 16:50:13 -03:00
Pablo Hoffman
ae7333d598 added simplejson optional dependency to doc 2009-07-09 16:49:20 -03:00
Pablo Hoffman
60e7b80798 removed signal docs from core.signals module, to leave them only in one place (the doc) 2009-07-09 12:57:10 -03:00
Ismael Carnales
9e3e41f946 added more newitem documentation in proposed 2009-07-09 11:29:04 -03:00
Pablo Hoffman
b071681cd4 removed duplicated spiders doc (which used autodoc) 2009-07-09 11:14:33 -03:00
Pablo Hoffman
4f19115a80 removed old setting from default_settings.py, updated doc of CONCURRENT_ITEMS setting 2009-07-09 10:56:15 -03:00
Pablo Hoffman
8b26e49636 Added new ItemProcessor component to Scraper component 2009-07-08 23:48:06 -03:00
Pablo Hoffman
0c4c153819 improved Scrapy documentation index for better usability 2009-07-01 09:51:57 -03:00
Pablo Hoffman
80cd534f92 removed redundant botname from log lines 2009-06-25 16:48:04 -03:00
Pablo Hoffman
834ac9fca0 Some telnet console changes:
- added telnet console documentation
- added documentation for debugging memory leaks with guppy
- sorted out shell aliases
- set default port (TELNETCONSOLE_PORT) to 6023
2009-06-23 16:08:58 -03:00
Pablo Hoffman
a8fd107ad4 engine: some extra simplifications and removed debug mode 2009-06-22 23:01:41 -03:00
Pablo Hoffman
400a54bf7c Added reasons when closing domains ('reason' argument to engine close_domain method), replaced old 'status' parameter by new 'reason' parameter in domain_closed signal. updated and improved signals doc 2009-06-21 22:00:16 -03:00
Pablo Hoffman
adccd9a04e Sorted out Duplicate Filter API.
--HG--
rename : scrapy/dupefilter/__init__.py => scrapy/contrib/dupefilter.py
2009-06-20 20:29:07 -03:00
Daniel Grana
47970e91bc core: Invert request priority meaning, a higher request.priority value means more priority 2009-06-20 19:23:26 -03:00
Daniel Grana
18b6fecc47 Remove custom redirection priority of request returned by downloadermiddleware 2009-06-20 19:19:07 -03:00
Pablo Hoffman
161335fe78 Added domain schedulers (whose functionality was previously mixed with the
Scrapy Scheduler) and removed domain prioritizers whose functionality became
duplicated by the new domain schedulers.
2009-06-19 17:55:54 -03:00
Pablo Hoffman
3cb12e5dd3 Moved init_domain functionality out of the engine (refs #88) and into the
spider level. A new spider (InitSpider, not yet documented) was added to
provide initialization facilities.

Also renamed make_request_from_url to make_requests_from_url and allowed it to
return iterables.
2009-06-18 14:43:56 -03:00