mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 05:43:52 +00:00

2205 Commits

Author SHA1 Message Date
Pablo Hoffman
cf8a085f44 Minor improvement to bash autocompletion 2010-08-22 20:10:11 -03:00
Pablo Hoffman
fd784cd131 Fixed tests on Windows 2010-08-22 19:37:20 -03:00
Pablo Hoffman
33686fa563 better skip test message 2010-08-22 19:08:52 -03:00
Pablo Hoffman
6da1162839 minor fixes to FAQ 2010-08-22 19:08:45 -03:00
Pablo Hoffman
b3753d34eb Added FAQ entry about feed exports 2010-08-22 05:59:30 -03:00
Pablo Hoffman
cbfec4bb0e Renamed webservice ManagerResource to CrawlerResource
--HG--
rename : scrapy/contrib/webservice/manager.py => scrapy/contrib/webservice/crawler.py
2010-08-22 05:48:03 -03:00
Pablo Hoffman
7546a0805c Removed webservice Spiders and Extensions resources since they can now be accessed through the Execution Manager (aka. Crawler) resource 2010-08-22 05:38:46 -03:00
Pablo Hoffman
e5474d8cc6 Made ExtensionManager a subclass of MiddlewareManager 2010-08-22 05:33:08 -03:00
Pablo Hoffman
c1225e0f45 "parse" command refactoring. This fixes #173 and renders #106 invalid. 2010-08-22 05:04:17 -03:00
Pablo Hoffman
9fccc11363 Moved scrapy.extension.extensions singleton to an "extensions" attribute of the scrapy.project.crawler singleton. Refs #189 2010-08-22 02:15:11 -03:00
Pablo Hoffman
52c1e137e5 Moved scrapy.spider.spiders singleton to a "spiders" attribute of the scrapy.project.crawler singleton. Refs #189
Warning: this is a backwards incompatible change.

--HG--
rename : scrapy/spider/models.py => scrapy/spider.py
2010-08-22 02:15:10 -03:00
Pablo Hoffman
faf7a7da83 Moved scrapymanager singleton to scrapy.project module. Refs #189
Detail of changes:

* Moved scrapy.core.manager.ExecutionManager class to scrapy.crawler.Crawler
* Added scrapy.project.crawler singleton to reference a singleton instance of
  Crawler class (previously known as scrapymanager)
* Left an alias scrapy.core.manager.scrapymanager to scrapy.project.crawler for
  backwards compatibility (to be removed in Scrapy 0.11)
2010-08-22 02:10:53 -03:00
Pablo Hoffman
053d45e79f Split stats collector classes from stats collection facility (#204)
* moved scrapy.stats.collector.__init__ module to scrapy.statscol
* moved scrapy.stats.collector.simpledb module to scrapy.contrib.statscol
* moved signals from scrapy.stats.signals to scrapy.signals
* moved scrapy/stats/__init__.py to scrapy/stats.py
* updated documentation and tests accordingly

--HG--
rename : scrapy/stats/collector/simpledb.py => scrapy/contrib/statscol.py
rename : scrapy/stats/__init__.py => scrapy/stats.py
rename : scrapy/stats/collector/__init__.py => scrapy/statscol.py
2010-08-22 01:24:07 -03:00
Pablo Hoffman
c276c48c91 Added settings to Scrapy shell variables 2010-08-21 05:10:06 -03:00
Pablo Hoffman
b8b7b5ad74 Improved some commands descriptions 2010-08-21 05:03:38 -03:00
Pablo Hoffman
e6d2a3087e example projects: added scrapy.cfg and removed scrapy-ctl.py 2010-08-21 04:56:19 -03:00
Pablo Hoffman
68f9fcffe8 genspider command refactoring. Also updated tests and doc 2010-08-21 04:46:48 -03:00
Pablo Hoffman
0da6132136 Made command-line tool output more concise 2010-08-21 03:37:59 -03:00
Pablo Hoffman
0d9e75c684 Added bash completion for the Scrapy command-line tool. Closes #210 2010-08-21 03:23:45 -03:00
Pablo Hoffman
f80ae9af66 Fixed missing reference to old 'start' command. Refs #209 2010-08-21 01:44:08 -03:00
Pablo Hoffman
50621b7ef3 Renamed command "start" to "runserver". Closes #209
--HG--
rename : scrapy/commands/start.py => scrapy/commands/runserver.py
2010-08-21 01:42:55 -03:00
Pablo Hoffman
9aefa242d5 Applied documentation patch provided by Lucian Ursu (closes #207) 2010-08-21 01:26:35 -03:00
Pablo Hoffman
f782245c5a Removed obsolete files 2010-08-21 01:24:39 -03:00
Pablo Hoffman
1d3b9e2ca8 Scrapy shell refactoring 2010-08-20 11:26:14 -03:00
Pablo Hoffman
7858244dca Scrapy shell: moved python console starting code to scrapy.utils.console and get rid of noisy console banners 2010-08-20 01:33:02 -03:00
Pablo Hoffman
136b0e7450 Minor change to log message 2010-08-19 21:14:27 -03:00
Pablo Hoffman
6dd76ab54b Fixed bug in Scrapy shell which hung if requests failed to download (#205), added dont_filter=True to requests generated when calling the shell with a url argument, and changed formatting of messages 2010-08-19 21:11:39 -03:00
Pablo Hoffman
30e2404d8f updated FAQ entry to recommend using higher download delays 2010-08-19 17:59:52 -03:00
Pablo Hoffman
66525e77c7 Improved support for scrapy-ctl -> scrapy migration by generating the scrapy.cfg file automatically if it doesn't exist 2010-08-19 17:57:34 -03:00
Pablo Hoffman
3d8151bb26 Added FAQ entry about response code 999 2010-08-19 16:51:51 -03:00
Pablo Hoffman
2ff5a83b7a Added persistent execution queue (based on SQLite), and a new 'queue' command to control it. Closes #198 2010-08-19 02:55:52 -03:00
Pablo Hoffman
9740ad62a6 Added tests for runspider command 2010-08-19 02:30:15 -03:00
Pablo Hoffman
abd7a5e221 Fixed bug with runspider command that appeared after the introduction of new spider manager in r2121. Also factored out common code shared by new spider manager and runspider command, with tests included. 2010-08-19 01:58:10 -03:00
Pablo Hoffman
85f3a4a603 removed obsolete setting 2010-08-19 00:07:14 -03:00
Pablo Hoffman
94ead94bf6 Improved documentation of Scrapy command-line tool
--HG--
rename : docs/topics/cmdline.rst => docs/topics/commands.rst
2010-08-19 00:04:52 -03:00
Pablo Hoffman
34554da201 Deprecated scrapy-ctl.py command in favour of simpler "scrapy" command. Closes #199. Also updated documentation accordingly and added convenient scrapy.bat script for running from Windows.
--HG--
rename : debian/scrapy-ctl.1 => debian/scrapy.1
rename : docs/topics/scrapy-ctl.rst => docs/topics/cmdline.rst
2010-08-18 19:48:32 -03:00
Daniel Grana
5522afca8d pipeline process_item methods decorated with inlineCallbacks fail because of the backwards compatible change to support inverted arguments 2010-08-18 13:05:50 -03:00
Pablo Hoffman
50f2236d04 removed hacky command_executed signal 2010-08-17 18:31:16 -03:00
Pablo Hoffman
a71521bfba Default per-command settings are now specified in the default_settings attribute of the command object. Closes #201 2010-08-17 18:30:13 -03:00
Pablo Hoffman
76e19baa29 minor setting fix 2010-08-17 14:48:38 -03:00
Pablo Hoffman
ad3fd0afe8 fixed minor formatting issue with new feed exports doc 2010-08-17 14:37:59 -03:00
Pablo Hoffman
e741a807d2 Added new Feed exports extension with documentation and storage tests. Closes #197.
Also deprecated File export pipeline (to be removed in Scrapy 0.11).

Still need to add tests for FeedExport main extension code.
2010-08-17 14:27:48 -03:00
Pablo Hoffman
d17695ee4b Added "scrapy" command with project settings auto-discovery. Refs #199 2010-08-17 00:59:24 -03:00
Pablo Hoffman
b563e56363 fixed bug in url_is_from_spider() when no allowed_domains class attribute is present 2010-08-16 12:25:51 -03:00
Pablo Hoffman
a28cf29008 Simplified BaseSpider code by removing backwards compatibility code 2010-08-16 10:10:57 -03:00
Pablo Hoffman
3e3a66620b Added support for returning deferreds from (some) signal handlers. Closes #193 2010-08-14 21:10:37 -03:00
Pablo Hoffman
5755bdcddc Improved logging of spider errors, which were previously logged as confusing "Unhandled errors" - closes #196 2010-08-13 01:45:47 -03:00
Pablo Hoffman
1df2c17b78 updated old documentation references 2010-08-12 20:45:11 -03:00
Pablo Hoffman
43d47e5d9b Some improvements to Item Pipeline (closes #195):
* Made Item Pipeline Manager a subclass of scrapy.middleware.MiddlewareManager
* Added open_spider/close_spider methods with support for returning deferreds from them
* Inverted the process_item() arguments to be more friendly with deferred
  callbacks (backwards compatibility kept through arguments introspection)
* Updated documentation with new methods and process_item() arguments change
2010-08-12 10:48:37 -03:00
Pablo Hoffman
2167bfb20d added prepend_level argument to log._adapt_eventdict() 2010-08-12 10:26:50 -03:00