Pablo Hoffman
cfd11df539
Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12
2011-02-24 15:28:57 -02:00
Pablo Hoffman
8f7e163b04
Fixed wrong method name in downloader middleware documentation
2011-02-24 15:26:32 -02:00
Daniel Grana
c55355642c
fix FAQ typos reported by marlun_ at #scrapy IRC channel
2011-02-16 08:57:42 -02:00
Pablo Hoffman
1fb55bdaf0
Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12
2011-02-15 07:25:12 -02:00
Pablo Hoffman
16d9a33951
added FAQ entry about working with big data feeds
2011-02-15 07:24:52 -02:00
Pablo Hoffman
936353d5f1
Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12
2011-02-09 11:20:46 -02:00
Pablo Hoffman
181d1c09ae
Fixed typo and code indentation in the doc. Closes #307 and #308
2011-02-09 11:19:46 -02:00
Pablo Hoffman
c91f0d9ea1
Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12
2011-02-04 13:39:54 -02:00
Pablo Hoffman
c5499ead73
Clarified behaviour when multiple rules match the same link in CrawlSpider
2011-02-04 13:39:12 -02:00
Pablo Hoffman
d7f193cbea
bumped version to 0.13 in documentation
2011-01-02 17:29:43 -02:00
Pablo Hoffman
b56e933be9
bumped version to 0.12 in documentation
2011-01-02 17:28:33 -02:00
Pablo Hoffman
5879389ad0
Bumped version to 0.12
2011-01-02 16:16:40 -02:00
Pablo Hoffman
fa644f7a5e
Some simplifications to Scrapyd architecture and internals:
...
- launcher no longer knows about egg storage
- removed get_spider_list_from_eggifile() file and replaced by simpler
get_spider_list() which doesn't receive an egg file as argument
- changed "egg runner" name to just "runner" to reflect the fact that it
doesn't necessarily run eggs (though it does in the default case)
--HG--
rename : scrapyd/eggrunner.py => scrapyd/runner.py
2010-12-27 16:22:32 -02:00
Pablo Hoffman
633ebc4c43
minor indentation improvement
2010-12-23 13:04:49 -02:00
Pablo Hoffman
db07a9a938
Added notice to documentation, pointing dev to stable versions and vice versa
2010-12-23 13:03:40 -02:00
Pablo Hoffman
544308d6d0
updated ubuntu repos doc, in preparation for the 0.11 release
2010-12-21 11:02:56 -02:00
Pablo Hoffman
002abf204f
Updated item_passed signal to send passed item in 'item' argument, instead of 'output' argument, keeping backwards compatibility for the 'output' argument. Closes #273
2010-12-13 14:05:47 -02:00
Pablo Hoffman
f984d438a0
updated docs to use scrapy version on aptitude install lines
2010-12-13 14:02:42 -02:00
Pablo Hoffman
119fd20e91
Added verbose option to 'version' command. Closes #298
2010-12-13 00:32:44 -02:00
Pablo Hoffman
6a1b69c93f
renamed command 'scrapyd' to 'server', and deprecated 'runserver' and 'queue' commands
...
--HG--
rename : scrapy/commands/scrapyd.py => scrapy/commands/server.py
2010-11-30 20:23:27 -02:00
Pablo Hoffman
df54ed0041
Some Scrapyd enhancements:
...
* added minimal web ui
* return unique id per job (spider scheduled)
* store one log per spider run (job) and rotate them, keeping the last N logs (where N is configurable through settings)
2010-11-30 02:26:31 -02:00
Pablo Hoffman
bbffa59497
Some changes to Scrapyd:
...
* Always start one process per spider
* Added max_proc_per_cpu option (defaults to 4)
* Return the number of spiders (instead of a list of them) in schedule.json
2010-11-29 17:19:05 -02:00
Pablo Hoffman
2557777c39
Updated doc referring to HTTP cache middleware
2010-11-24 13:27:44 -02:00
Pablo Hoffman
426b6fa100
docs/intro/install.rst: added -U flag to easy_install command
2010-11-22 13:50:19 -02:00
Pablo Hoffman
91a7c25797
* Made Response.meta attribute map to Request.meta attribute. Closes #290
...
* Record redirected URLs in redirect middleware. Closes #291
2010-11-18 12:51:54 -02:00
Pablo Hoffman
ac007802d6
Simplified installation guide, including lxml as alternative dependency to libxml2. Closes #280
2010-11-17 21:32:23 -02:00
Pablo Hoffman
5a5364d0c1
Updated documentation to point out that simplejson is now required if using Python 2.5, and to recommend switching to Python 2.6
2010-11-16 03:31:04 -02:00
Pablo Hoffman
d988ca1ec2
Some changes to scrapy deploy command:
...
* changed deploy section names to [deploy:target]
* project is now passed through a -p|--project option
* version can now be set in the target configuration
* switched meaning of -l and -L options
* updated documentation accordingly
2010-11-08 17:01:06 -02:00
Pablo Hoffman
0f69e7a191
Some changes to HTTP Cache middleware:
...
* made it use the project data storage by default (closes #279 )
* added HTTPCACHE_ENABLED setting (False by default) to enable it
* made HTTPCACHE_DIR = 'httpcache' by default (inside the project data storage)
* simplified HTTPCACHE_EXPIRATION_SECS semantics: zero means don't expire,
dropped support for negative numbers
* other minor doc improvements
2010-11-01 02:38:15 -02:00
Pablo Hoffman
3c94c6cb9b
fixed sphinx doc id
2010-11-01 02:31:20 -02:00
dfdeshom
130276605b
Bind the web server and telnet server to a configurable interface (WEBSERVICE_HOST). The default is to bind to all interfaces. Also add documentation for WEBSERVICE_HOST and TELNETCONSOLE_HOST.
2010-11-01 00:59:04 -02:00
Pablo Hoffman
b76c5c597f
* Added support for project data storage ( closes #276 )
...
* Documented project file structure
* Moved default location of SQLite database to project data storage dir (closes #277 )
2010-10-31 03:25:37 -02:00
Pablo Hoffman
dfa6745e91
Automated merge with http://hg.scrapy.org/scrapy-0.10
2010-10-30 16:05:53 -02:00
Pablo Hoffman
a0d9b43031
fixed typo in scrapyd doc
2010-10-30 16:05:32 -02:00
Pablo Hoffman
d67152ab0f
Automated merge with http://hg.scrapy.org/scrapy-0.10
2010-10-30 01:56:12 -02:00
Pablo Hoffman
75451cbe84
scrapyd doc: fixed delversion.json example
2010-10-30 01:56:00 -02:00
Pablo Hoffman
a59bfb539d
* Added lxml backend for XPath selectors. Closes #147
...
* Added new setting (SELECTORS_BACKEND) to choose which backend to use
* Deprecated the extract_unquoted() function from selectors
* Made libxml2 optional by adding a dummy selector backend. Closes #260
--HG--
rename : scrapy/tests/test_selector.py => scrapy/tests/test_selector_libxml2.py
2010-10-25 14:47:10 -02:00
Pablo Hoffman
6c921896a5
Expanded documentation on deploy command and versions. Refs #261
2010-10-19 00:11:45 -02:00
Pablo Hoffman
1d567cdce6
Added new 'deploy' command. Closes #261
2010-10-18 22:38:46 -02:00
Pablo Hoffman
7d8f922df9
Added documentation for CLOSESPIDER_ERRORCOUNT setting. Refs #254
2010-10-18 22:36:30 -02:00
Pablo Hoffman
c96f17c43d
Automated merge with http://hg.scrapy.org/scrapy-0.10
2010-10-18 03:21:21 -02:00
Pablo Hoffman
98662e53ea
Formatting fix in Scrapyd doc
2010-10-17 03:20:23 -02:00
Pablo Hoffman
a3d85da96f
Automated merge with http://hg.scrapy.org/scrapy-0.10
2010-10-16 19:54:24 -02:00
Pablo Hoffman
5f65c26080
Some minor improvements to feature list in Scrapy at a Glance documentation page
2010-10-16 19:02:08 -02:00
Pablo Hoffman
d5c8caf07b
Automated merge with http://hg.scrapy.org/scrapy-0.10
2010-10-10 20:31:38 -02:00
Pablo Hoffman
b4fbc6c5fa
Updated Scrapy Tutorial to reference feed exports, instead of a custom-written pipeline, and extended item pipeline documentation to include a JSON writer.
2010-10-10 20:31:05 -02:00
Pablo Hoffman
aa4142e4ba
Automated merge with http://hg.scrapy.org/scrapy-0.10
2010-10-07 18:23:48 -02:00
Pablo Hoffman
f4accb6c7f
Updated dmoz xpaths of Scrapy tutorial
2010-10-07 18:22:01 -02:00
Pablo Hoffman
7826869cb2
Added missing colon
2010-09-28 16:44:53 -03:00
Martin Santos
0bf9e4627c
added support to CloseSpider extension for closing the spider after N pages have been crawled, using the CLOSESPIDER_PAGECOUNT setting. closes #253
2010-09-28 16:29:37 -03:00