mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 22:24:24 +00:00

3110 Commits

Author SHA1 Message Date
Pablo Hoffman
caa64908d2 added tests for get_func_args() and support for more cases 2012-09-04 18:07:44 -03:00
Pablo Hoffman
fff2871828 added doc section (and FAQ) about spider arguments 2012-09-04 14:49:30 -03:00
Pablo Hoffman
241abcb8aa scrapyd: log errors in API calls to the scrapyd log 2012-09-04 14:22:21 -03:00
Pablo Hoffman
f4a17ec272 removed references to Scrapy Snippets site 2012-09-03 22:19:15 -03:00
Pablo Hoffman
b901e64044 replaced memory usage accounting with (more portable) resource module, removed scrapy.utils.memory module. closes #161 2012-09-03 19:28:16 -03:00
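A rough sketch of the portable approach this commit refers to, using only the stdlib resource module; the function name is illustrative, not Scrapy's actual API:

    import resource

    def get_peak_memory_usage():
        # ru_maxrss is the peak resident set size: kilobytes on Linux,
        # bytes on macOS, so callers must normalize per platform.
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss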
Pablo Hoffman
e2f9daac67 fixed formatting in scrapyd release notes 2012-09-03 16:58:58 -03:00
Pablo Hoffman
dd1398b280 removed another instance of scrapy.conf.settings singleton, this time from download handlers 2012-09-03 16:44:18 -03:00
Daniel Graña
3b768ae989 oops. remove santa's debug line 2012-08-31 20:10:32 -03:00
Pablo Hoffman
c6435c5aa7 restored scrapyd log message 2012-08-31 18:49:20 -03:00
Pablo Hoffman
7c8af83138 added issue role to documentation 2012-08-31 18:39:30 -03:00
Pablo Hoffman
3891c20840 Merge pull request #164 from dangra/lazy-log-formatting
format log lines lazily in case they are dropped by loglevels
2012-08-31 14:33:41 -07:00
Daniel Graña
a2d22307ad update news.rst 2012-08-31 17:38:40 -03:00
Daniel Graña
dcef7b03c1 format log lines lazily in case they are dropped by loglevels 2012-08-31 17:14:35 -03:00
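The general idea behind this change, as a generic sketch (not the actual Scrapy logging code): defer building the message string until a handler actually renders it, so lines dropped by the log level cost almost nothing.

    import logging

    class LazyMessage(object):
        """Formats its arguments only if the log record is actually emitted."""
        def __init__(self, fmt, *args):
            self.fmt, self.args = fmt, args

        def __str__(self):
            return self.fmt % self.args

    # The %-formatting runs only if DEBUG records are actually handled.
    logging.getLogger(__name__).debug(LazyMessage("crawled %d pages", 1000))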
Pablo Hoffman
be206ca5ab added process_start_requests method to spider middlewares 2012-08-31 16:41:50 -03:00
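A minimal sketch of the new hook (the middleware name and meta key are made up for illustration): process_start_requests receives the spider's start requests and must return an iterable of requests.

    class StartRequestsTagger(object):
        """Hypothetical spider middleware using the new process_start_requests hook."""

        def process_start_requests(self, start_requests, spider):
            # Tag each start request before it reaches the scheduler.
            for request in start_requests:
                request.meta['from_start_requests'] = True
                yield request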
Pablo Hoffman
b920674666 corrected minor issue with doc references 2012-08-31 16:41:49 -03:00
Pablo Hoffman
a2d5a304b1 Merge pull request #168 from alexcepoi/parse-fixes
parse command: fix crash in case of bad callback
2012-08-31 09:12:19 -07:00
Alex Cepoi
0350bed8fd parse command: fix crash in case of bad callback 2012-08-31 16:43:04 +02:00
Pablo Hoffman
4ec99117d3 fixed minor doc typo 2012-08-30 11:56:30 -03:00
Pablo Hoffman
94b40162a9 fixed tests to work on windows 2012-08-30 11:24:29 -03:00
Pablo Hoffman
3904f5f55a more reliable way to set subprocess PYTHONPATH in tests 2012-08-30 10:09:20 -03:00
Pablo Hoffman
4a5f70278f minor tidy up to installation guide windows notes 2012-08-29 15:44:24 -03:00
Pablo Hoffman
098d892c03 simplified installation guide to only mention pip/easy_install mechanism, and provide hints for Windows users 2012-08-29 15:37:05 -03:00
Pablo Hoffman
6217f108cc avoid random "Interrupted system call" errors 2012-08-29 11:44:00 -03:00
Pablo Hoffman
8aa46a4b5d removed another usage of scrapy.conf.settings singleton 2012-08-29 11:43:01 -03:00
Pablo Hoffman
280d9d4094 added file missed in previous commit 2012-08-29 11:26:52 -03:00
Pablo Hoffman
70f8e517a1 promoted DjangoItem to main contrib 2012-08-29 11:23:11 -03:00
Pablo Hoffman
52151e8703 fixed bug in FeedExports extension, introduced in previous commit 2012-08-28 19:32:43 -03:00
Pablo Hoffman
da234af456 Merge branch 'singleton_removal' 2012-08-28 18:42:33 -03:00
Pablo Hoffman
babfc6e79b Updated documentation after singleton removal changes.
Also removed some unused code and made some minor additional
refactoring.
2012-08-28 18:35:57 -03:00
Pablo Hoffman
c95f717ef2 fixed typo in tests 2012-08-22 13:50:23 -03:00
Pablo Hoffman
5fa5c4545b use ScrapyJSONEncoder in JsonItemExporter & JsonLinesItemExporter, to support nested items properly 2012-08-22 13:46:50 -03:00
Pablo Hoffman
1e12c92b8f Removed signals/stats singletons
This change removes singletons for stats collection and signal
dispatching facilities, by making them a member of the Crawler class.

Here are some examples to illustrate the old and new API:

Signals - before:

    from scrapy import signals
    from scrapy.xlib.pydispatch import dispatcher
    dispatcher.connect(self.spider_opened, signals.spider_opened)

Signals - now:

    from scrapy import signals
    crawler.signals.connect(self.spider_opened, signals.spider_opened)

Stats collection - before:

    from scrapy.stats import stats
    stats.inc_value('foo')

Stats collection - now:

    crawler.stats.inc_value('foo')

Backwards compatibility was retained as much as possible and the old API
has been properly flagged with deprecation warnings.
2012-08-21 18:32:30 -03:00
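To illustrate the crawler-based API above end to end, a minimal sketch of an extension; the class name, counter keys and from_crawler wiring are assumptions for illustration, not part of the commit:

    from scrapy import signals

    class SpiderOpenCloseCounter(object):
        """Hypothetical extension counting spider open/close events."""

        def __init__(self, stats):
            self.stats = stats

        @classmethod
        def from_crawler(cls, crawler):
            # Signals and stats now hang off the crawler, not module singletons.
            ext = cls(crawler.stats)
            crawler.signals.connect(ext.spider_opened, signals.spider_opened)
            crawler.signals.connect(ext.spider_closed, signals.spider_closed)
            return ext

        def spider_opened(self, spider):
            self.stats.inc_value('spiders_opened')

        def spider_closed(self, spider):
            self.stats.inc_value('spiders_closed')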
Pablo Hoffman
36f47a4aec Removed per-spider settings concept, and scrapy.conf.settings singleton from many extensions and middlewares. There are some still remaining, that will be removed in future commits 2012-08-21 17:27:45 -03:00
Daniel Graña
19bcb44c25 Merge pull request #165 from andrix/master
Just a simple refactoring of the *MemoryQueue classes that improves performance
2012-08-10 15:04:30 -07:00
Andrés Moreira
ed8721059d Small improvement on *MemoryQueue performance (mainly on push time) 2012-08-10 10:38:20 -03:00
Daniel Graña
483a70cb5b Merge pull request #160 from midiotthimble/patch-1
Pass into the model only existing fields

tests pass OK and the changes make sense
2012-08-09 05:41:50 -07:00
Pablo Hoffman
3bdd9b7b89 added more extensions to ignore on link extractors 2012-08-07 14:23:50 -03:00
Pablo Hoffman
16bee46c70 added ppt to scrapy.linkextractor.IGNORED_EXTENSIONS 2012-08-07 14:18:55 -03:00
Daniel Graña
abcc8c9f63 Recommend pypi as single way to install on Windows 2012-08-06 10:21:13 -03:00
Daniel Graña
b7b0c49520 append parse command to example code sections in docs. closes #162 2012-08-06 09:10:16 -03:00
Vladislav Poluhin
0497ac2fc8 Simple test for default values of model in DjangoItem 2012-07-26 20:55:55 +08:00
Vladislav
e14c41636f Pass into the model only existing fields
Model fields have default values, but when a field doesn't exist in the item container, `None` was passed instead of the default value. My patch solves this problem.
2012-07-25 10:10:46 +08:00
Pablo Hoffman
832e45073b fixed typo in stats documentation. closes #159 2012-07-20 17:13:06 -03:00
Pablo Hoffman
831a45009a Merge pull request #155 from warvariuc/import_settings_module_verbose
Spider not found - confusion
2012-07-09 12:39:01 -07:00
Victor Varvariuc
eb91cad0da If you put something like `import local_settings` in settings.py and local_settings doesn't exist or contains errors, you get:
/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.1-py2.7.egg/scrapy/utils/project.py:17: UserWarning: Cannot import scrapy settings module settings
warnings.warn("Cannot import scrapy settings module %s" % scrapy_module)
...
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.1-py2.7.egg/scrapy/cmdline.py", line 117, in _run_command
cmd.run(args, opts)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.1-py2.7.egg/scrapy/commands/crawl.py", line 43, in run
spider = self.crawler.spiders.create(spname, **opts.spargs)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.1-py2.7.egg/scrapy/spidermanager.py", line 43, in create
raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: fb_spider'

Which is not very descriptive.

Now showing details about the exception when an ImportError is raised
2012-07-09 08:39:14 +04:00
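A hedged sketch of the kind of change described above: carry the original ImportError details into the warning instead of swallowing them (function and variable names are illustrative, not the exact Scrapy code):

    import importlib
    import warnings

    def import_settings_module(scrapy_module):
        try:
            return importlib.import_module(scrapy_module)
        except ImportError as exc:
            # Surface the underlying reason, e.g. a broken `import local_settings`
            warnings.warn("Cannot import scrapy settings module %s: %s"
                          % (scrapy_module, exc))
            return None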
Pablo Hoffman
99bfd1e590 Merge pull request #154 from vially/patch-1
Fix typo in tutorial
2012-07-04 04:36:16 -07:00
Valentin-Costel Hăloiu
00bfb37e79 Update master 2012-07-04 06:55:01 +03:00
Daniel Graña
a40090a6a7 Merge pull request #151 from msabramo/tox
Add support for tox (http://tox.testrun.org/)
2012-06-29 05:37:24 -07:00
Marc Abramowitz
61952c3be6 Add .tox to .gitignore 2012-06-28 12:43:33 -07:00
Marc Abramowitz
9ae8ea96c4 Add tox.ini for tox (http://tox.testrun.org/) 2012-06-28 12:43:12 -07:00