mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 02:03:50 +00:00

1991 Commits

Author SHA1 Message Date
Pablo Hoffman
ee08d38ab6 removed deprecated SCRAPYSETTINGS_MODULE environment variable 2009-11-06 16:56:17 -02:00
Pablo Hoffman
49e39bf1ba fixed typo 2009-11-06 16:49:48 -02:00
Pablo Hoffman
4a2a20489d removed deprecated ScrapedItem (previously kept for backwards compatibility) 2009-11-06 16:39:15 -02:00
Pablo Hoffman
791f4932dd added clarification about versioning and api stability 2009-11-06 16:28:51 -02:00
Pablo Hoffman
9bf4e87753 removed deprecated scrapy.xpath module (previously kept for backwards compatibility) 2009-11-06 16:20:38 -02:00
Pablo Hoffman
40646d3cd1 replaced remaining uses of log.msg() 'domain' argument to use 'spider' instead 2009-11-06 16:11:37 -02:00
Pablo Hoffman
fee1b751e5 deprecated 'domain' argument in log.msg() 2009-11-06 16:02:12 -02:00
Pablo Hoffman
74d0e82dbe renamed CloseDomain extension to CloseSpider, and renamed CLOSEDOMAIN_* settings to CLOSESPIDER_*
--HG--
rename : scrapy/contrib/closedomain.py => scrapy/contrib/closespider.py
2009-11-06 15:54:17 -02:00
Pablo Hoffman
cff4592b64 changed label 2009-11-06 15:48:49 -02:00
Pablo Hoffman
919cd5b789 renamed setting CONCURRENT_DOMAINS to CONCURRENT_SPIDERS 2009-11-06 15:44:11 -02:00
Pablo Hoffman
d604dca96d renamed setting REQUESTS_PER_DOMAIN to REQUESTS_PER_SPIDER 2009-11-06 15:42:11 -02:00
Pablo Hoffman
580d82468e fixed images pipeline bug caused by recent api changes 2009-11-06 14:29:37 -02:00
Pablo Hoffman
e5cae1e6c9 renamed setting 2009-11-06 14:19:56 -02:00
Pablo Hoffman
7728a23e99 Changed item pipeline API to pass spider references (instead of domain names) to process_item() method 2009-11-06 13:46:36 -02:00
Pablo Hoffman
a432c1ee40 updated logging doc to include new spider argument in log functions 2009-11-04 14:49:24 -02:00
Pablo Hoffman
69058e3b50 fixed bug in log.msg when receiving new spider argument 2009-11-04 14:45:56 -02:00
Pablo Hoffman
55d584e478 added missing spider argument to log.msg call 2009-11-04 14:45:34 -02:00
Daniel Grana
87e322e568 fixes to lxml link extractors api and encoding handling 2009-11-04 12:39:58 -02:00
Pablo Hoffman
97c322707a * Renamed domain_{opened,closed,idle} signals to spider_{opened,closed,idle}
* Changed them to pass spider instances only (no domains) (refs #105)
2009-11-03 00:39:02 -02:00
Pablo Hoffman
a3f2933912 another minor change in stats test 2009-11-02 20:28:36 -02:00
Pablo Hoffman
65b6065323 minor change in stats test 2009-11-02 19:16:42 -02:00
Pablo Hoffman
4d4f5dca12 Automated merge with http://hg.scrapy.org/scrapy-stable 2009-10-31 14:37:46 -02:00
Pablo Hoffman
d5ae94df55 fixed bug when using log.start() with log level module constants instead of string names, and added regression tests 2009-10-31 14:36:38 -02:00
Pablo Hoffman
904cde6513 added clarification about new dont_click argument of FormRequest.from_response() method 2009-10-29 13:47:10 -02:00
Pablo Hoffman
77b8cfadc7 changed odd logic in if statement 2009-10-29 13:35:32 -02:00
Daniel Grana
b7a00cd2a1 return a generator in CrawlSpider 2009-10-29 13:24:30 -02:00
Ismael Carnales
a244d23b89 added dont_click attr to FormRequest 2009-10-29 13:18:13 -02:00
Pablo Hoffman
b41c5b5d5b fixed typo in intro/install doc (thanks phaithful) 2009-10-29 10:41:20 -02:00
Pablo Hoffman
9b5fef4f48 fixed typo in intro/install doc (thanks phaithful) 2009-10-28 09:34:31 -02:00
Pablo Hoffman
7baff2914a fixed bug caused when instantiating selectors with responses containing empty bodies 2009-10-21 16:37:30 -02:00
Pablo Hoffman
53ae45ad3d restored previous try/except catch all blocks for Libxml2Document __del__() method 2009-10-21 16:25:49 -02:00
Pablo Hoffman
7296a7b889 added DEFAULT_RESPONSE_ENCODING setting 2009-10-21 16:13:41 -02:00
Daniel Grana
3d7a4c890e fix get_meta_refresh bug raised for TextResponses without encoding 2009-10-21 13:57:06 -02:00
Daniel Grana
6405efa507 factorize redirectmw code that handles 302 and meta-refresh redirections
--HG--
extra : rebase_source : f78fee093ac076d4a0630982dc4bade16755fa3b
2009-10-21 12:46:18 -02:00
Ismael Carnales
5371d900a5 corrected the banner from scrapyctl webconsole module 2009-10-21 12:14:41 -02:00
Daniel Grana
0d637a6703 ensure meta refresh redirections use GET requests even if coming from POSTs 2009-10-21 11:35:36 -02:00
Pablo Hoffman
939e302b6e make get_meta_refresh() function more robust and changed interface to return (int, str) as (interval, absolute_url) 2009-10-21 09:37:04 -02:00
Pablo Hoffman
720bc166cf updated new clickdata argument doc 2009-10-20 17:21:56 -02:00
Daniel Grana
6abb3c17ee Improve FormRequest.from_response method to pass click data arguments to ClientForm library 2009-10-20 15:51:41 -02:00
daniel
789cba2bd8 improve csviter memory usage by use of a generator instead of splitlines 2009-10-15 16:26:05 +01:00
Daniel Grana
d6735b3952 do not modify request.headers inside http download handler 2009-10-15 11:19:52 -02:00
Daniel Grana
cc85c463ae add missing lxml iterators module 2009-10-14 18:37:51 -02:00
Daniel Grana
23c49bcb3f move lxml based xmliter function to contrib_exp 2009-10-14 13:22:28 -02:00
Daniel Grana
6bb84f0797 remove response body references from experimental decompression mw 2009-10-13 16:38:49 -02:00
Daniel Grana
f7fbbfec29 rollback xmliter functions to private, we won't support multiple versions of xmliter 2009-10-10 23:31:16 -02:00
Daniel Grana
6db5a37cdb do specialized xmliter public functions and add tests 2009-10-09 17:00:31 -02:00
daniel
51c4be78d7 Add a lxml based xmliter function enabled by default if lxml is available 2009-10-09 04:02:48 +01:00
daniel
af09029649 trackref Libxml2Document objects and do not silence xml exceptions on __del__ 2009-10-08 16:32:42 +01:00
daniel
d7024dcd33 urllib does not support no_proxy in python < 2.6. Skip its tests. 2009-10-08 16:31:37 +01:00
Pablo Hoffman
2712d55cb9 Automated merge with http://hg.scrapy.org/scrapy-stable 2009-10-07 23:58:38 -02:00