mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 04:43:51 +00:00

2611 Commits

Author SHA1 Message Date
Shane Evans
a1c3fa5dd8 small refactor of image extraction 2011-02-15 15:42:10 -02:00
Pablo Hoffman
1fb55bdaf0 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-02-15 07:25:12 -02:00
Pablo Hoffman
16d9a33951 added FAQ entry about working with big data feeds 2011-02-15 07:24:52 -02:00
Ismael Carnales
9b07b0ab0a Fix xmliter_lxml 2011-02-11 11:41:44 -02:00
Pablo Hoffman
874bfa0284 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-02-10 17:41:13 -02:00
Pablo Hoffman
3dc677725e Fixed scrapy.utils.python unittests 2011-02-10 17:27:40 -02:00
Pablo Hoffman
e9f3724f1c Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-02-10 17:12:37 -02:00
Pablo Hoffman
ce8137b738 Replace unknown characters in sgml link extractor, to deal more gracefully with encoding errors in the page. Closes #309 2011-02-10 17:12:03 -02:00
Ismael Carnales
bfc6c3809b Add namespace support to xmliter_lxml 2011-02-09 16:20:48 -02:00
Pablo Hoffman
936353d5f1 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-02-09 11:20:46 -02:00
Pablo Hoffman
181d1c09ae Fixed typo and code indentation in the doc. Closes #307 and #308 2011-02-09 11:19:46 -02:00
Pablo Hoffman
c91f0d9ea1 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-02-04 13:39:54 -02:00
Pablo Hoffman
c5499ead73 Clarified behaviour when multiple rules match the same link in CrawlSpider 2011-02-04 13:39:12 -02:00
Pablo Hoffman
dde4ccf665 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-02-04 13:30:37 -02:00
Pablo Hoffman
65fc2fbd1f Set CONCURRENT_SPIDERS=1 in Scrapyd to force one spider per process 2011-02-04 13:30:01 -02:00
Pablo Hoffman
c4fd7174a6 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-01-28 16:24:18 -02:00
Pablo Hoffman
b1c89508f5 fixed wrong changes committed in previous changeset 2011-01-28 16:22:39 -02:00
Pablo Hoffman
4361150494 added missing scrapyd/default_scrapyd.conf file to MANIFEST.in 2011-01-28 16:21:00 -02:00
Pablo Hoffman
632bc27deb added tests for Link object 2011-01-25 19:51:17 -02:00
Shane Evans
c5351d2f48 add __hash__ method to Link object to be compatible with the __eq__ method 2011-01-25 19:23:50 -02:00
Martin Olveyra
32adbea545 handle case when attributes are not separated by space (still recognizable because of quotes) 2011-01-24 18:40:42 -02:00
Pablo Hoffman
18e097d496 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-01-13 13:14:57 -02:00
Pablo Hoffman
09f084c220 simplified scrapy shell code after recent changes. refs #306 2011-01-13 13:11:39 -02:00
Pablo Hoffman
0aac226b42 Fixed bug in Scrapy shell's fetch() which wasn't updating local variables properly. Closes #306 2011-01-13 13:08:11 -02:00
Pablo Hoffman
c87fef9c77 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-01-11 17:24:33 -02:00
LucianU
0c5f605b0e The xmlfeed.tmpl file didn't use the naming convention specific to XMLFeedSpider. Namely, it used parse_item (which has been deprecated) instead of parse_node, and it didn't show the iterator and itertag attributes. 2011-01-11 17:23:46 -02:00
Pablo Hoffman
1b319a94b4 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-01-11 17:04:21 -02:00
Pablo Hoffman
7dc521c56e make scrapyd package depend on specific scrapy version 2011-01-11 17:03:51 -02:00
Pablo Hoffman
7f7a952354 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-01-05 16:04:33 -02:00
Pablo Hoffman
048044c1f8 A couple of changes to fix #303:
* improved detection of inside-project environments
* make list command faster (by only instantiating the spider manager)
* print a warning when extensions (middlewares, etc) are disabled with a message on NotConfigured exception
* assert that scrapy configuration hasn't been loaded in scrapyd.runner
* simplified IgnoreRequest exception, to avoid loading settings when importing scrapy.exceptions
* added test to make sure certain modules don't cause scrapy.conf module to be
  loaded, to ensure the scrapyd runner bootstrapping performs properly
2011-01-05 15:59:43 -02:00
Pablo Hoffman
48b30ba939 fixed compatibility with python 2.5 and removed unused code 2011-01-05 12:03:54 -02:00
Pablo Hoffman
ebf5ad933e fixed compatibility with python 2.5 and removed unused code 2011-01-05 11:59:19 -02:00
Martin Olveyra
0ba9999cca Handle malformed tags with no trailing > 2011-01-05 11:02:05 -02:00
Pablo Hoffman
579463aff2 make scrapy*-0.13 packages conflict with scrapy*-0.12 packages 2011-01-04 13:57:32 -02:00
Pablo Hoffman
744b146aa3 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-01-04 13:57:10 -02:00
Pablo Hoffman
ac8166251e make scrapy*-0.12 packages conflict with scrapy*-0.11 packages 2011-01-04 13:56:28 -02:00
Pablo Hoffman
d7f193cbea bumped version to 0.13 in documentation 2011-01-02 17:29:43 -02:00
Pablo Hoffman
a7c80d0a06 Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12 2011-01-02 17:28:57 -02:00
Pablo Hoffman
b56e933be9 bumped version to 0.12 in documentation 2011-01-02 17:28:33 -02:00
Pablo Hoffman
cd42bd7d0c Bumped version to 0.13 2011-01-02 17:21:31 -02:00
Pablo Hoffman
5879389ad0 Bumped version to 0.12 2011-01-02 16:16:40 -02:00
Vikas Dhiman
fe218abbfd scrapy.utils.memory.get_vmvalue_from_procfs()
causes test case failure on SunOS 5.10 i86pc. Modified it to support SunOS 5.10
2011-01-02 15:46:11 -02:00
Shane Evans
aebe5d5073 make inside_project work with SCRAPY_SETTINGS_MODULE. Closes #300 2010-12-28 15:34:39 -02:00
Pablo Hoffman
3d8b368fc6 scrapyd: use runner from config (if not specified) on get_spider_list() 2010-12-28 11:16:58 -02:00
Pablo Hoffman
fa644f7a5e Some simplifications to Scrapyd architecture and internals:
- launcher no longer knows about egg storage
- removed get_spider_list_from_eggifile() file and replaced it with a simpler
  get_spider_list() which doesn't receive an egg file as argument
- changed "egg runner" name to just "runner" to reflect the fact that it
  doesn't necessarily run eggs (though it does in the default case)

--HG--
rename : scrapyd/eggrunner.py => scrapyd/runner.py
2010-12-27 16:22:32 -02:00
Pablo Hoffman
9cd649b3a0 scrapyd: populate SCRAPY_SPIDER and SCRAPY_JOB environment variables 2010-12-26 19:32:56 -02:00
Pablo Hoffman
1c8d74eb5b scrapyd: populate SCRAPY_SLOT environment variable with the scrapyd slot number 2010-12-24 12:47:59 -02:00
Martin Olveyra
d9a3df45c6 Remove deprecated match common prefix feature from IBL code 2010-12-23 14:40:22 -02:00
Pablo Hoffman
633ebc4c43 minor indentation improvement 2010-12-23 13:04:49 -02:00
Pablo Hoffman
db07a9a938 Added notice to documentation, pointing dev to stable versions and vice versa 2010-12-23 13:03:40 -02:00