scrapy

mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 15:43:48 +00:00

Author	SHA1	Message	Date
Pablo Hoffman	4f28ffcb2c	removed no longer needed dependency on simplejson	2012-04-10 16:01:36 -03:00
Pablo Hoffman	6e8edbd72e	switched default selectors backend to lxml	2012-04-10 15:52:14 -03:00
Pablo Hoffman	27018fced7	changed default user agent to Scrapy/0.15 (+http://scrapy.org) and removed no longer needed BOT_VERSION setting	2012-03-23 13:45:21 -03:00
Pablo Hoffman	8933e2f2be	added REFERER_ENABLED setting, to control referer middleware	2012-03-22 16:35:14 -03:00
Jason Yeo	da826aa13d	fixed minor mistake in Request objects documentation	2012-03-21 10:25:41 +08:00
Pablo Hoffman	175c70ad44	fixed minor defect in link extractors documentation	2012-03-20 22:56:45 -03:00
Pablo Hoffman	35fb01156e	removed some obsolete remaining code related to sqlite support in scrapy	2012-03-16 11:55:55 -03:00
Pablo Hoffman	b6ae266546	Removed (very old and possibly broken) backwards compatibility support for Twisted 2.5	2012-03-15 00:28:24 -03:00
Pablo Hoffman	e521da2e2f	Dropped support for Python 2.5. See: http://blog.scrapy.org/scrapy-dropping-support-for-python-25	2012-03-01 08:18:12 -02:00
Pablo Hoffman	2b16ebdc11	added minor clarification on cookiejar request meta key usage	2012-02-29 07:19:01 -02:00
lostsnow	5afe4f50c1	scrapyd: support bind to a specific ip address	2012-02-29 13:47:40 +08:00
Pablo Hoffman	81abb45000	fixed bug in new cookiejar documentation	2012-02-28 11:08:25 -02:00
Pablo Hoffman	26c8004125	added documentation for the new cookiejar Request.meta key	2012-02-27 19:58:58 -02:00
Pablo Hoffman	7fe7c3f3b1	MemoryUsage extension: close the spiders (instead of stopping the engine) when the limit is exceeded, providing a descriptive reason for the close. Also fixed default value of MEMUSAGE_ENABLED setting to match the documentation.	2012-02-23 17:05:06 -02:00
Pablo Hoffman	7b8942a648	updated StackTraceDump extension doc	2012-02-16 15:14:17 -02:00
Pablo Hoffman	ea77342b55	updated versioning doc according to recent changes	2012-01-05 11:50:28 -02:00
Pablo Hoffman	0b0bce7f3c	scrapyd: added cancel.json and listjobs.json api methods to documentation	2012-01-05 11:23:25 -02:00
Pablo Hoffman	8f42633a94	scrapyd: added clarification about how to disable items feeds generation	2012-01-05 11:20:50 -02:00
Pablo Hoffman	dbda33efa6	scrapyd: added support for storing items by default. Items are stored the same way as logs, in jsonlines format. Also renamed logs_to_keep setting to jobs_to_keep.	2012-01-03 23:08:54 -02:00
Pablo Hoffman	0be421fbf0	fixed reference to tutorial directory	2011-12-23 18:57:11 -02:00
Pablo Hoffman	41fd3c4f6c	doc: removed duplicated callback argument from Request.replace()	2011-12-23 15:55:46 -02:00
Pablo Hoffman	0eeff76227	fixed formatting of scrapyd doc	2011-12-20 03:18:37 -02:00
Daniel Graña	bcb31988f2	change tutorial to follow changes on dmoz site	2011-12-14 13:03:31 -02:00
Pablo Hoffman	992af8d38f	ubuntu repos: added support for oneiric release	2011-10-25 14:26:38 -02:00
Pablo Hoffman	c38c49d56a	fixed PickleItemExporter bug, added unittest, and added pickle to supported feed exports formats	2011-10-25 02:36:51 -02:00
Pablo Hoffman	8bdf288428	made scrapyd doc more version agnostic	2011-10-23 05:29:54 -02:00
Pablo Hoffman	ade5efdc61	added -o option to scrapy crawl, a convenient shortcut for using feed exports	2011-10-22 20:53:49 -02:00
Pablo Hoffman	431441cb52	updated documentation to remove references to old issue tracker and mercurial repos	2011-09-25 13:06:24 -03:00
Pablo Hoffman	ce03ccd4ec	updated documentation about DEPTH_PRIORITY and DFO/BFO crawls	2011-09-23 13:22:25 -03:00
Julien Duponchelle	b7c436343a	scrapy deploy support git version	2011-09-21 22:17:08 +02:00
Pablo Hoffman	ab1c9cfc56	removed documentation header notifying about other documentation versions, as that's provided by readthedocs already	2011-09-14 02:39:32 -03:00
Daniel Grana	5f1b1c05f8	Do not filter requests with dont_filter attribute set in OffsiteMiddleware	2011-09-08 15:18:10 -03:00
Pablo Hoffman	bff3d31469	scrapyd: updated schedule.json response format	2011-09-04 09:29:24 -03:00
Pablo Hoffman	a1dbc62b45	removed CONCURRENT_SPIDERS setting (use scrapyd maxproc instead)	2011-09-02 18:27:39 -03:00
Pablo Hoffman	40f7075f11	added initial documentation about suspend and resume crawls	2011-09-02 13:12:27 -03:00
Pablo Hoffman	27dd68a690	added SpiderState extension	2011-09-02 13:06:59 -03:00
Pablo Hoffman	6a31ab667d	minor fix to doc	2011-09-01 15:08:23 -03:00
Pablo Hoffman	d98b058c21	no longer recommend using lambdas in the doc, as they're not friendly with scheduler persistence	2011-09-01 15:06:49 -03:00
Pablo Hoffman	76af0cdd44	updated documentation and code to use -s instead of --set option	2011-09-01 14:35:37 -03:00
Pablo Hoffman	98b68ca89d	scrapyd: documented support for passing setting to spiders in schedule.json	2011-08-27 01:31:12 -03:00
Pablo Hoffman	5c6b0631e2	minor doc fix	2011-08-19 11:42:03 -03:00
Pablo Hoffman	9d97e73a24	fixed priority handling on the new scheduler so that it's backwards compatible (ie. bigger priorities are higher). also fixed a few documentation bugs related to requests priority	2011-08-19 08:26:41 -03:00
Pablo Hoffman	a3697421c0	some minor updates to documentation	2011-08-11 09:19:59 -03:00
Pablo Hoffman	5da6ffb57b	Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12	2011-08-11 09:11:19 -03:00
Pablo Hoffman	bc2d2183e9	fixed import in doc	2011-08-11 09:11:08 -03:00
Pablo Hoffman	19e6da59d8	added new downloader middleware: ChunkedTransferMiddleware	2011-08-09 03:03:25 -03:00
Pablo Hoffman	984be35461	Some telnet console changes: renamed manager alias to crawler; added aliases: spider, slot; fixed est() function	2011-08-08 15:01:08 -03:00
Pablo Hoffman	f7c0aeccc6	added note about engine_started signal	2011-08-07 03:57:09 -03:00
Pablo Hoffman	9f60c27612	added setting to support disabling DNS cache: DNSCACHE_ENABLED	2011-08-05 20:41:59 -03:00
Pablo Hoffman	cb95d7a5af	added marshal to formats supported by feed exports	2011-08-03 16:16:48 -03:00

544 Commits