scrapy

mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 04:44:21 +00:00

Author	SHA1	Message	Date
Pablo Hoffman	a034d078c8	Changed Debian packaging to use the scrapy version in the package name, so we can have multiple Scrapy versions in the same apt repo --HG-- rename : debian/scrapy.1 => debian/scrapy-files/scrapy.1 rename : debian/000-default => debian/scrapyd-files/000-default rename : debian/scrapyd.upstart => debian/scrapyd.scrapyd.upstart rename : debian/scrapy.1 => extras/scrapy.1	2010-11-17 17:03:00 -02:00
Pablo Hoffman	67adb2a05f	Always use micro versions in Scrapy from now on	2010-11-17 00:09:14 -02:00
Pablo Hoffman	5a5364d0c1	Updated documentation to point out that simplejson is now required if using Python 2.5, and to recommend switching to Python 2.6	2010-11-16 03:31:04 -02:00
Pablo Hoffman	7c712eeda1	Removed scrapy.xlib.simplejson module. Scrapy now requires simplejson if running on Python 2.5. Closes #289	2010-11-16 03:11:12 -02:00
Pablo Hoffman	28cf8625b6	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-11-16 02:37:19 -02:00
Pablo Hoffman	5ded8251d2	Fixed bug with deferred spider queues. Closes #288	2010-11-16 02:36:18 -02:00
Pablo Hoffman	b3c96c698d	Fixed bug with deploy command if ~/.netrc doesn't exist. Closes #286	2010-11-13 16:38:30 -02:00
Pablo Hoffman	e1f419e9e9	canonicalize_url(): ignore case in domain names	2010-11-12 16:47:36 -02:00
Martin Olveyra	b4cc2d91f4	Allow to reapply a labelled region so to allow to use ignored regions inside repeated variants	2010-11-12 13:30:39 -02:00
Pablo Hoffman	08bbbc2f82	shell: properly refresh all vars when fetching a new request	2010-11-11 18:05:36 -02:00
Pablo Hoffman	5c18f02ade	Only instantiate XPath selectors if the response is of the proper type. Closes #285	2010-11-11 18:00:02 -02:00
Pablo Hoffman	4e800e90d3	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-11-10 13:36:05 -02:00
Pablo Hoffman	7698ff2a7b	Disabled help() in telnet console. Closes #284	2010-11-10 13:35:36 -02:00
Pablo Hoffman	d988ca1ec2	Some changes to scrapy deploy command: * changed deploy section names to [deploy:target] * project is now passed through a -p|--project option * version can now be set in the target configuration * switched meaning of -l and -L options * updated documentation accordingly	2010-11-08 17:01:06 -02:00
Pablo Hoffman	37c9d5feff	minor update to queue command doc	2010-11-08 02:19:54 -02:00
Pablo Hoffman	5bdffadbe3	Simplified get_spider_list_from_eggfile() function now that it doesn't need to chdir to a custom directory (Scrapy now works when it's unable to create the SQLite database)	2010-11-05 11:48:12 -02:00
Pablo Hoffman	31bbcc9476	Raise error when egg is corrupt in activate_egg(). Use a more descriptive name for temporary dirs in get_spider_list_from_eggfile(). Make scrapyd webservice pass egg_runner to get_spider_list_from_eggfile()	2010-11-05 11:24:33 -02:00
Pablo Hoffman	de4909faca	get_spider_list_from_eggfile(): more improvements to error messages, and support passing eggruner module as argument	2010-11-04 18:56:11 -02:00
Pablo Hoffman	7ba972d8cf	get_spider_list_from_eggfile(): fail if unable to extract spider list	2010-11-04 16:27:47 -02:00
Pablo Hoffman	369a149d60	fixed bug with deploy command in windows environments: WindowsError: [Error 32] The process cannot access the file because it is being used by another process	2010-11-03 17:16:08 -02:00
Pablo Hoffman	a6c6dfe9ba	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-11-01 16:10:40 -02:00
Pablo Hoffman	1a2146d11a	Fixed bug detecting response types for some urls. Closes #281	2010-11-01 16:09:33 -02:00
Pablo Hoffman	0f69e7a191	Some changes to HTTP Cache middleware: * made it use the project data storage by default (closes #279) * added HTTPCACHE_ENABLED setting (False by default) to enable it * made HTTPCACHE_DIR = 'httpcache' by default (inside the project data storage) * simplified HTTPCACHE_EXPIRATION_SECS semantics: zero means don't expire, dropped support for negative numbers * other minor doc improvements	2010-11-01 02:38:15 -02:00
Pablo Hoffman	3c94c6cb9b	fixed sphinx doc id	2010-11-01 02:31:20 -02:00
Pablo Hoffman	0b7e815888	Added Didier to AUTHORS	2010-11-01 01:25:30 -02:00
dfdeshom	130276605b	Bind the web server and telnet server to a configurable interface (WEBSERVICE_HOST). The default is to bind to all interfaces. Also add documentation for WEBSERVICE_HOST and TELNETCONSOLE_HOST.	2010-11-01 00:59:04 -02:00
Pablo Hoffman	b76c5c597f	* Added support for project data storage (closes #276 ) * Documented project file structure * Moved default location of SQLite database to project data storage dir (closes #277)	2010-10-31 03:25:37 -02:00
Pablo Hoffman	dfa6745e91	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-10-30 16:05:53 -02:00
Pablo Hoffman	a0d9b43031	fixed typo in scrapyd doc	2010-10-30 16:05:32 -02:00
Pablo Hoffman	3d96016da1	runtests.sh: switched to 'text' reporter in trial	2010-10-30 16:03:00 -02:00
Pablo Hoffman	f73449fed6	removed LxmlItemLoader, as it has been obsoleted by the new lxml selector backend	2010-10-30 16:02:14 -02:00
Pablo Hoffman	d67152ab0f	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-10-30 01:56:12 -02:00
Pablo Hoffman	75451cbe84	scrapyd doc: fixed delversion.json example	2010-10-30 01:56:00 -02:00
Pablo Hoffman	20efdc0273	added --egg argument to scrapy deploy command, and log message when building the egg	2010-10-29 16:21:36 -02:00
Pablo Hoffman	85dec82688	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-10-29 03:42:54 -02:00
Pablo Hoffman	1cc5cba69b	Fixed bug logging Passed items. Closes #274	2010-10-29 03:42:21 -02:00
Pablo Hoffman	836e40896a	minor fixes for python 2.5 compatibility	2010-10-29 02:23:10 -02:00
Pablo Hoffman	22283854d4	avoid stripping trailing spaces on lxml-based selectors. closes #270	2010-10-27 21:39:28 -02:00
Pablo Hoffman	7f646541c3	added trackref stats to memory debugger report. closes #272	2010-10-27 21:18:58 -02:00
Pablo Hoffman	1d5c56089c	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-10-27 14:42:47 -02:00
Pablo Hoffman	c3e5b4bb03	changed pid file name to scrapyd	2010-10-27 14:42:22 -02:00
Pablo Hoffman	2bba87f69f	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-10-27 14:20:21 -02:00
Pablo Hoffman	f47b9f608c	simplified lockfile used by scrapyd (/var/run/scrapyd.pid instead of /var/run/scrapyd/scrapyd.pid). closes #271	2010-10-27 14:18:40 -02:00
Daniel Grana	bc2d78406c	MediaPipeline fails to assign crawler.engine.download as download function because crawler is configured after pipelines are loaded	2010-10-27 12:36:24 -02:00
Pablo Hoffman	158f75450b	Automated merge with http://hg.scrapy.org/scrapy-0.10	2010-10-27 09:18:48 -02:00
Pablo Hoffman	6f4be21d4c	changed robots.txt forbidden log level to DEBUG. closes #268	2010-10-27 09:17:58 -02:00
Pablo Hoffman	e625a8d56e	moved all similar selector tests to common selector tests, to reuse them among all backends	2010-10-27 08:54:32 -02:00
Pablo Hoffman	d1f63237ad	refactored selectors tests, by splitting tests in: common tests, lxml-specific tests and libxml2-specific tests. refs #147	2010-10-27 08:37:02 -02:00
Pablo Hoffman	665578bfe8	fixed imports in scrapy.xlib.simplejson	2010-10-27 08:05:08 -02:00
Pablo Hoffman	9b9ab37804	fixed bug with boolean results in lxml-based selectors	2010-10-27 08:03:37 -02:00

2822 Commits