Pablo Hoffman
234fd709ad
fixed doc typo (thanks Victor)
2010-03-19 10:32:17 -03:00
Daniel Grana
184cf6684f
Remove HttpException references from docs. Since 0.7, scrapy returns non-200 as Response objects and does not raise HttpException anymore
2010-03-18 10:05:33 -03:00
Daniel Grana
17091902f3
Explicitly say where to save item class in "Defining our item" section of tutorial
2010-03-12 14:12:49 -02:00
Daniel Grana
c925c9e9a0
Notify spider when requests are ignored by HttpErrorMiddleware, and generally when any call to process_spider_input raises an exception
2010-05-12 16:41:06 -03:00
Daniel Grana
c0d45846b8
Automated merge with ssh://hg.scrapy.org/scrapy-0.8
2010-04-26 22:29:45 -03:00
Pablo Hoffman
81f6502e37
Automated merge with http://hg.scrapy.org/scrapy-0.8/
2010-04-24 18:22:13 -03:00
Daniel Grana
658e6f15e9
Automated merge with ssh://hg.scrapy.org/scrapy-0.8
2010-04-18 23:44:59 -03:00
Daniel Grana
68a875edb0
update ENCODING_ALIASES setting default value in settings documentation topic
2010-04-07 10:54:54 -03:00
Pablo Hoffman
de32612c99
Automated merge with http://hg.scrapy.org/scrapy-0.8
2010-04-02 02:49:51 -03:00
Rolando Espinoza La fuente
db5c3df679
SEP12 implementation
...
* Rename BaseSpider.domain_name to BaseSpider.name
This patch implements the domain_name to name change in the BaseSpider class and
changes all spider instantiations to use the new attribute.
* Add allowed_domains to spider
This patch implements the merging of spider.domain_name and
spider.extra_domain_names in spider.allowed_domains for offsite checking
purposes.
Note that spider.domain_name is not touched by this patch; it is just no longer used.
* Remove spider.domain_name references from scrapy.stats
* Rename domain_stats to spider_stats in MemoryStatsCollector
* Use ``spider`` instead of ``domain`` in SimpledbStatsCollector
* Rename domain_stats_history table to spider_data_history and rename domain
field to spider in MysqlStatsCollector
* Refactor genspider command
The new signature for genspider is: genspider [options] <domain_name>.
Genspider uses domain_name for spider name and for the module name.
* Remove spider.domain_name references
* Update crawl command signature <spider|url>
* docs: updated references to domain_name
* examples/experimental: use spider.name
* genspider: require <name> <domain>
* spidermanager: renamed crawl_domain to crawl_spider_name
* spiderctl: updated references of *domain* to spider
* added backward compatibility with legacy spider attributes
'domain_name' and 'extra_domain_names'
2010-04-01 18:27:22 -03:00
Pablo Hoffman
2299deda66
updated wrong link in doc
2010-03-26 14:02:33 -03:00
Pablo Hoffman
7cf2f87e27
Automated merge with http://hg.scrapy.org/scrapy-0.8
2010-03-26 08:29:34 -03:00
Pablo Hoffman
1330697c3d
Some improvements to Response encoding support:
...
* added encoding aliases, configurable through a new ENCODING_ALIASES setting
* Response.encoding now returns the real encoding detected for the body
* simplified TextResponse API by removing body_encoding() and
headers_encoding() methods
* Response.encoding now tries to infer the encoding from the body always (it
was done before only on HtmlResponse and TextResponse)
* removed scrapy.utils.encoding.add_encoding_alias() function
* updated implementation of scrapy.utils.response function to reflect these API
changes
* updated documentation to reflect API changes
2010-03-25 15:47:10 -03:00
Pablo Hoffman
9ddcd1095d
sort setting alphabetically
2010-03-25 11:45:06 -03:00
Pablo Hoffman
4fa833c849
Added LOG_ENCODING setting
2010-03-24 12:13:38 -03:00
Pablo Hoffman
87e68e7438
Made MailSender non IO-blocking, and improved MailSender documentation
2010-03-22 13:37:37 -03:00
Pablo Hoffman
1dfc79b5d0
Automated merge with http://hg.scrapy.org/scrapy-0.8
2010-03-20 20:48:11 -03:00
Pablo Hoffman
264cd2e035
Automated merge with http://hg.scrapy.org/scrapy-0.8
2010-03-19 10:32:42 -03:00
Pablo Hoffman
d12cd22d5e
switched default scheduler order to DFO, which consumes less memory by default
2010-03-04 10:15:58 -02:00
Pablo Hoffman
180c091fb2
Fixed encoding issue (reported in #135) when the encoding declared in the HTTP header is unknown. This is the patch proposed by Rolando, with an update to the Request/Response documentation.
2010-02-24 14:01:29 -02:00
Pablo Hoffman
bbef0fe870
Automated merge with http://hg.scrapy.org/users/rolando/scrapy/
2010-02-20 11:12:37 -02:00
Pablo Hoffman
a3d22c7240
Automated merge with http://hg.scrapy.org/scrapy-0.8/
2010-02-19 23:11:24 -02:00
Pablo Hoffman
60961e5499
minor documentation fix (refs #135)
2010-02-19 23:09:48 -02:00
Pablo Hoffman
c1f8198639
Added RANDOMIZE_DOWNLOAD_DELAY setting
2010-02-19 21:53:18 -02:00
Rolando Espinoza La fuente
a6a3f085a7
docs: added crawlspider v2 outline documentation
...
Sign-Off: Rolando Espinoza La fuente
2010-02-19 18:22:38 -04:00
Rolando Espinoza La fuente
7235040936
merged upstream
2010-02-19 17:41:45 -04:00
Daniel Grana
91f4d6dc51
docs: adds another spider example that yields multiple requests/items from a single callback
2010-02-18 16:51:05 -02:00
Pablo Hoffman
57d60eae39
sort settings doc alphabetically by setting name
2010-01-31 18:11:13 -02:00
Pablo Hoffman
67858af83c
fixed doc typo
2010-01-18 18:16:58 -02:00
Pablo Hoffman
08eeaf98a2
fixed description of LOG_STDOUT setting
2010-01-13 15:51:08 -02:00
Pablo Hoffman
48739ae60c
install.rst: added explanation about why libxml2 2.6.28 or above is required
2010-01-13 12:20:24 -02:00
Rolando Espinoza La fuente
1402da31c5
docs: fixed typos and updated code examples
2010-01-11 12:28:22 -04:00
Pablo Hoffman
d60412ce19
titlecased Scrapy easy_install and some fixes to sign_release.sh script
2009-12-13 14:23:31 -02:00
Pablo Hoffman
422d6facb2
Automated merge with http://hg.scrapy.org/scrapy-stable
2009-12-12 16:52:07 -02:00
Pablo Hoffman
9d50604d24
added |version| to documentation title
2009-12-12 16:51:59 -02:00
Pablo Hoffman
a953efd8e5
Automated merge with http://hg.scrapy.org/scrapy-stable
2009-12-12 15:40:16 -02:00
Ismael Carnales
4ecc909bc1
Fix RobotsTxtMiddleware reference in doc
2009-12-04 15:37:24 -02:00
Ismael Carnales
07344666e2
Move webconsole extensions doc to webconsole topic
2009-12-01 10:47:11 -02:00
Ismael Carnales
e694c8ed02
Remove domain references in close spider extension doc
2009-11-30 11:38:56 -02:00
Ismael Carnales
12a7ff7312
Rename Close domain to close spider in extensions doc
2009-11-30 11:36:18 -02:00
Ismael Carnales
8d9cedd88b
Reorder signals doc to respect alphabetical order
2009-11-30 11:29:19 -02:00
Ismael Carnales
93cc3d2715
Correct param formatting in item pipelines doc
2009-11-30 11:04:15 -02:00
Pablo Hoffman
6084be3b2e
added iter_all() function to scrapy.utils.trackref module and improved memory leaks documentation. also added a new FAQ entry about memory issues
2009-11-28 16:21:59 -02:00
Pablo Hoffman
dd662e09d8
some minor fixes to scheduler middleware doc
2009-11-19 12:23:54 -02:00
Pablo Hoffman
f4e93700bd
Automated merge with http://hg.scrapy.org/scrapy-stable/
2009-11-19 10:44:02 -02:00
Pablo Hoffman
c4f77c4da0
minor fixes to images doc (thanks amccloud)
2009-11-16 11:15:25 -02:00
Pablo Hoffman
0d6aee1f12
updated wrong documentation
2009-11-13 20:03:56 -02:00
Pablo Hoffman
aeab5370cb
StatsCollector: ported methods to receive spider instances (closes #113), removed list_domains() method, added iter_spider_stats() method
2009-11-14 20:28:59 -02:00
Pablo Hoffman
c4c6e7c8cd
Automated merge with http://hg.scrapy.org/scrapy-stable/
2009-11-13 20:04:39 -02:00
Pablo Hoffman
07655d05ea
renamed REQUESTS_PER_SPIDER setting to CONCURRENT_REQUESTS_PER_SPIDER
2009-11-13 14:38:22 -02:00