mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 14:44:08 +00:00

3297 Commits

Author SHA1 Message Date
Alex Cepoi
0350bed8fd parse command: fix crash in case of bad callback 2012-08-31 16:43:04 +02:00
Pablo Hoffman
4ec99117d3 fixed minor doc typo 2012-08-30 11:56:30 -03:00
Pablo Hoffman
94b40162a9 fixed tests to work on windows 2012-08-30 11:24:29 -03:00
Pablo Hoffman
3904f5f55a more reliable way to set subprocess PYTHONPATH in tests 2012-08-30 10:09:20 -03:00
Pablo Hoffman
4a5f70278f minor tidy up to installation guide windows notes 2012-08-29 15:44:24 -03:00
Pablo Hoffman
098d892c03 simplified installation guide to only mention pip/easy_install mechanism, and provide hints for Windows users 2012-08-29 15:37:05 -03:00
Alex Cepoi
6f1a5d8db6 SEP-017 contracts: various minor changes 2012-08-29 18:31:10 +02:00
Pablo Hoffman
6217f108cc avoid random "Interrupted system call" errors 2012-08-29 11:44:00 -03:00
Pablo Hoffman
8aa46a4b5d removed another usage of scrapy.conf.settings singleton 2012-08-29 11:43:01 -03:00
Pablo Hoffman
280d9d4094 added file missed in previous commit 2012-08-29 11:26:52 -03:00
Pablo Hoffman
70f8e517a1 promoted DjangoItem to main contrib 2012-08-29 11:23:11 -03:00
Pablo Hoffman
52151e8703 fixed bug in FeedExports extension, introduced in previous commit 2012-08-28 19:32:43 -03:00
Pablo Hoffman
da234af456 Merge branch 'singleton_removal' 2012-08-28 18:42:33 -03:00
Pablo Hoffman
babfc6e79b Updated documentation after singleton removal changes.
Also removed some unused code and made some minor additional
refactoring.
2012-08-28 18:35:57 -03:00
Alex Cepoi
901987154e SEP-017 contracts
* load contracts from settings
* refactored contracts manager
* fixed callback bug, which caused responses to be evaluated with a
wrong callback sometimes
* "returns" contract
2012-08-24 16:42:13 +02:00
Pablo Hoffman
c95f717ef2 fixed typo in tests 2012-08-22 13:50:23 -03:00
Pablo Hoffman
5fa5c4545b use ScrapyJSONEncoder in JsonItemExporter & JsonLinesItemExporter, to support nested items properly 2012-08-22 13:46:50 -03:00
Pablo Hoffman
1e12c92b8f Removed signals/stats singletons
This change removes singletons for stats collection and signal
dispatching facilities, by making them a member of the Crawler class.

Here are some examples to illustrate the old and new API:

Signals - before:

    from scrapy import signals
    from scrapy.xlib.pydispatch import dispatcher
    dispatcher.connect(self.spider_opened, signals.spider_opened)

Signals - now:

    from scrapy import signals
    crawler.signals.connect(self.spider_opened, signals.spider_opened)

Stats collection - before:

    from scrapy.stats import stats
    stats.inc_value('foo')

Stats collection - now:

    crawler.stats.inc_value('foo')

Backwards compatibility was retained as much as possible and the old API
has been properly flagged with deprecation warnings.
2012-08-21 18:32:30 -03:00
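A minimal sketch of the pattern described in commit 1e12c92b8f, making no claims about Scrapy's internals. `MiniCrawler`, `MiniSignals`, `MiniStats`, and `SpiderCounter` are hypothetical stand-ins, not Scrapy classes. They only mimic the `crawler.signals.connect(...)` and `crawler.stats.inc_value(...)` calls quoted in the message, to illustrate why per-crawler members keep state isolated where a module-level singleton would share it.

```python
from collections import defaultdict

# Stand-in for scrapy.signals.spider_opened (hypothetical; any unique object works here).
spider_opened = object()


class MiniSignals:
    """Hypothetical per-crawler signal dispatcher (not Scrapy's implementation)."""

    def __init__(self):
        self._receivers = defaultdict(list)

    def connect(self, receiver, signal):
        self._receivers[signal].append(receiver)

    def send(self, signal, **kwargs):
        for receiver in self._receivers[signal]:
            receiver(**kwargs)


class MiniStats:
    """Hypothetical per-crawler stats collector (not Scrapy's implementation)."""

    def __init__(self):
        self._stats = {}

    def inc_value(self, key, count=1):
        self._stats[key] = self._stats.get(key, 0) + count

    def get_value(self, key, default=None):
        return self._stats.get(key, default)


class MiniCrawler:
    """Owns its own signals and stats instead of relying on global singletons."""

    def __init__(self):
        self.signals = MiniSignals()
        self.stats = MiniStats()


class SpiderCounter:
    """Hypothetical extension that receives the crawler it belongs to."""

    def __init__(self, crawler):
        self.stats = crawler.stats
        crawler.signals.connect(self.spider_opened, spider_opened)

    def spider_opened(self, spider):
        self.stats.inc_value('foo')


crawler_a, crawler_b = MiniCrawler(), MiniCrawler()
SpiderCounter(crawler_a)
SpiderCounter(crawler_b)

# Only crawler_a fires the signal, so only its stats change.
crawler_a.signals.send(spider_opened, spider="example")
print(crawler_a.stats.get_value('foo'), crawler_b.stats.get_value('foo', 0))  # prints: 1 0
```

With a shared singleton, both counters would have been incremented by the same signal. Hanging the facilities off each crawler object avoids that coupling.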
Pablo Hoffman
36f47a4aec Removed per-spider settings concept, and scrapy.conf.settings singleton from many extensions and middlewares. Some usages still remain and will be removed in future commits 2012-08-21 17:27:45 -03:00
Alex Cepoi
99b76eaa2c SEP-017 contracts: first draft 2012-08-21 02:47:35 +02:00
Daniel Graña
19bcb44c25 Merge pull request #165 from andrix/master
Just a simple refactoring on *MemoryQueue classes that improves the performance
2012-08-10 15:04:30 -07:00
Andrés Moreira
ed8721059d Small improvement on *MemoryQueue performance (mainly on push time) 2012-08-10 10:38:20 -03:00
Daniel Graña
483a70cb5b Merge pull request #160 from midiotthimble/patch-1
Pass into the model only existing fields

Tests pass OK and the changes make sense
2012-08-09 05:41:50 -07:00
Pablo Hoffman
3bdd9b7b89 added more extensions to ignore on link extractors 2012-08-07 14:23:50 -03:00
Pablo Hoffman
16bee46c70 added ppt to scrapy.linkextractor.IGNORED_EXTENSIONS 2012-08-07 14:18:55 -03:00
Daniel Graña
abcc8c9f63 Recommend pypi as single way to install on Windows 2012-08-06 10:21:13 -03:00
Daniel Graña
b7b0c49520 append parse command to example code sections in docs. closes #162 2012-08-06 09:10:16 -03:00
Vladislav Poluhin
0497ac2fc8 Simple test for default values of model in DjangoItem 2012-07-26 20:55:55 +08:00
Vladislav
e14c41636f Pass into the model only existing fields
Model fields have default values, but when a field doesn't exist in the item container, `None` was passed instead of the default value. This patch solves that problem.
2012-07-25 10:10:46 +08:00
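The problem described in commit e14c41636f can be sketched without Django. Everything below is a hypothetical illustration: `PersonModel`, `FIELD_NAMES`, and both helper functions are invented names, and the actual DjangoItem code may differ. The sketch only shows why passing `None` for a missing field overrides a model default, while passing only the existing fields preserves it.

```python
class PersonModel:
    """Hypothetical stand-in for a Django model; not Django's real API."""

    def __init__(self, name=None, status="active"):
        self.name = name
        self.status = status  # "active" plays the role of a model field default


FIELD_NAMES = ("name", "status")


def kwargs_with_none(item):
    # Behavior described as the bug: missing fields are passed as None.
    return {field: item.get(field) for field in FIELD_NAMES}


def kwargs_existing_only(item):
    # Behavior described as the fix: only fields present in the item are passed.
    return {field: item[field] for field in FIELD_NAMES if field in item}


item = {"name": "John"}  # "status" was never set on the item
old = PersonModel(**kwargs_with_none(item))
new = PersonModel(**kwargs_existing_only(item))
print(old.status, new.status)  # prints: None active
```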
Pablo Hoffman
832e45073b fixed typo in stats documentation. closes #159 2012-07-20 17:13:06 -03:00
Pablo Hoffman
831a45009a Merge pull request #155 from warvariuc/import_settings_module_verbose
Spider not found - confusion
2012-07-09 12:39:01 -07:00
Victor Varvariuc
eb91cad0da If you put something like `import local_settings` in settings.py and local_settings doesn't exist or contains errors, you get:
/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.1-py2.7.egg/scrapy/utils/project.py:17: UserWarning: Cannot import scrapy settings module settings
warnings.warn("Cannot import scrapy settings module %s" % scrapy_module)
...
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.1-py2.7.egg/scrapy/cmdline.py", line 117, in _run_command
cmd.run(args, opts)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.1-py2.7.egg/scrapy/commands/crawl.py", line 43, in run
spider = self.crawler.spiders.create(spname, **opts.spargs)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.15.1-py2.7.egg/scrapy/spidermanager.py", line 43, in create
raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: fb_spider'

Which is not very descriptive.

Now shows details about the exception when ImportError is raised
2012-07-09 08:39:14 +04:00
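The idea behind commit eb91cad0da, including the underlying exception in the warning, might look roughly like the sketch below. It is written for modern Python 3 rather than the Python 2.7 shown in the traceback, and `import_settings_module` is a hypothetical helper, not the function in `scrapy/utils/project.py`. Only the first half of the warning text comes from the traceback above; appending the exception text is an assumption about how the details are surfaced.

```python
import importlib
import warnings


def import_settings_module(module_name):
    """Try to import a settings module; warn with the real cause on failure."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Appending the exception text tells the user *why* the import failed,
        # for example a missing local_settings module imported by settings.py.
        warnings.warn(
            "Cannot import scrapy settings module %s: %s" % (module_name, exc)
        )
        return None


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = import_settings_module("local_settings_that_does_not_exist")

message = str(caught[0].message)
print(message)
```

Without the appended detail, the first visible failure is the later `KeyError: 'Spider not found: ...'`, which points away from the real cause.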
Pablo Hoffman
99bfd1e590 Merge pull request #154 from vially/patch-1
Fix typo in tutorial
2012-07-04 04:36:16 -07:00
Valentin-Costel Hăloiu
00bfb37e79 Update master 2012-07-04 06:55:01 +03:00
Daniel Graña
a40090a6a7 Merge pull request #151 from msabramo/tox
Add support for tox (http://tox.testrun.org/)
2012-06-29 05:37:24 -07:00
Marc Abramowitz
61952c3be6 Add .tox to .gitignore 2012-06-28 12:43:33 -07:00
Marc Abramowitz
9ae8ea96c4 Add tox.ini for tox (http://tox.testrun.org/) 2012-06-28 12:43:12 -07:00
Daniel Graña
277ed0ae23 Merge pull request #145 from alexcepoi/cookies-changes
domain and path support for request cookies
2012-06-25 11:29:04 -07:00
Pablo Hoffman
8b30575dd3 Merge pull request #147 from tonal/cfg-jsons
Configure json services members
2012-06-25 11:27:13 -07:00
Alexandru Cepoi
177c81745d domain and path support for request cookies 2012-06-25 20:17:59 +02:00
Alexandr N Zamaraev (aka tonal)
cf968c328f Configure json services members 2012-06-25 12:50:48 +07:00
Pablo Hoffman
179e3810dc fixed links to doc. closes #150 2012-06-24 01:00:33 -03:00
Pablo Hoffman
700f20b28f Merge pull request #149 from alexcepoi/parse-changes
documentation for `parse` command, debugging spiders section
2012-06-22 17:50:40 -07:00
Alexandru Cepoi
2e05cf5685 fix small bug with parse command 2012-06-21 20:06:50 +02:00
Alexandru Cepoi
f4faa19e31 added docs topic debugging spiders 2012-06-21 20:03:33 +02:00
Pablo Hoffman
4eeda53b0d Merge pull request #148 from tonal/cfg-launcher
Add launcher class to config
2012-06-19 08:14:58 -07:00
Alexandr N Zamaraev (aka tonal)
3bddedfc6d Add launcher class to config 2012-06-19 22:01:53 +07:00
Daniel Graña
27689009f1 fix urlparse monkeypatches for python 2.7.4. closes #144 2012-06-18 11:50:43 -03:00
Alexandru Cepoi
3e05a2ecf6 update docs for parse command 2012-06-12 18:28:10 +02:00
Pablo Hoffman
e1be9c01bc updated FAQ about bot bans 2012-06-08 18:33:53 -03:00