mirror of https://github.com/scrapy/scrapy.git synced 2025-02-23 22:44:03 +00:00

57 Commits

Author SHA1 Message Date
Pablo Hoffman
b76c5c597f * Added support for project data storage (closes #276)
* Documented project file structure
* Moved default location of SQLite database to project data storage dir (closes #277)
2010-10-31 03:25:37 -02:00
Pablo Hoffman
9599bde3e9 Removed RequestLimitMiddleware 2010-09-22 16:09:13 -03:00
Pablo Hoffman
ed4aec187f Ported code to use new unified access to spider settings, keeping backwards compatibility for old spider attributes. Refs #245 2010-09-22 16:09:13 -03:00
Pablo Hoffman
b6c2b55e5b Split settings classes from settings singleton. Closes #244
--HG--
rename : scrapy/conf/__init__.py => scrapy/conf.py
rename : scrapy/conf/default_settings.py => scrapy/settings/default_settings.py
rename : scrapy/tests/test_conf.py => scrapy/tests/test_settings.py
2010-09-22 15:47:33 -03:00
Pablo Hoffman
766f2d910d Renamed Request Handlers to Download Handlers 2010-09-05 19:35:53 -03:00
Pablo Hoffman
6bf52fb50e Make telnet console and web service try a range of ports for binding, instead of just one. Closes #226 2010-09-05 06:48:08 -03:00
Pablo Hoffman
14e985b076 Updated Command line tool documentation 2010-09-05 05:29:58 -03:00
Pablo Hoffman
1190f97944 Updated settings documentation 2010-09-05 04:58:14 -03:00
Pablo Hoffman
053d45e79f Split stats collector classes from stats collection facility (#204)
* moved scrapy.stats.collector.__init__ module to scrapy.statscol
* moved scrapy.stats.collector.simpledb module to scrapy.contrib.statscol
* moved signals from scrapy.stats.signals to scrapy.signals
* moved scrapy/stats/__init__.py to scrapy/stats.py
* updated documentation and tests accordingly

--HG--
rename : scrapy/stats/collector/simpledb.py => scrapy/contrib/statscol.py
rename : scrapy/stats/__init__.py => scrapy/stats.py
rename : scrapy/stats/collector/__init__.py => scrapy/statscol.py
2010-08-22 01:24:07 -03:00
Pablo Hoffman
9aefa242d5 Applied documentation patch provided by Lucian Ursu (closes #207) 2010-08-21 01:26:35 -03:00
Pablo Hoffman
94ead94bf6 Improved documentation of Scrapy command-line tool
--HG--
rename : docs/topics/cmdline.rst => docs/topics/commands.rst
2010-08-19 00:04:52 -03:00
Pablo Hoffman
34554da201 Deprecated scrapy-ctl.py command in favour of simpler "scrapy" command. Closes #199. Also updated documentation accordingly and added convenient scrapy.bat script for running from Windows.
--HG--
rename : debian/scrapy-ctl.1 => debian/scrapy.1
rename : docs/topics/scrapy-ctl.rst => docs/topics/cmdline.rst
2010-08-18 19:48:32 -03:00
Pablo Hoffman
a71521bfba Default per-command settings are now specified in the default_settings attribute of the command object. Closes #201 2010-08-17 18:30:13 -03:00
Pablo Hoffman
e741a807d2 Added new Feed exports extension with documentation and storage tests. Closes #197.
Also deprecated File export pipeline (to be removed in Scrapy 0.11).

Still need to add tests for FeedExport main extension code.
2010-08-17 14:27:48 -03:00
Pablo Hoffman
1df2c17b78 updated old documentation references 2010-08-12 20:45:11 -03:00
Pablo Hoffman
bd16d1cd48 Added SMTP-AUTH support to scrapy.mail (closes #149) 2010-06-13 17:14:46 -03:00
Pablo Hoffman
6a33d6c4d0 * Added Scrapy Web Service with documentation and tests.
* Marked Web Console as deprecated.
* Removed Web Console documentation to discourage its use.
2010-06-09 13:46:22 -03:00
Pablo Hoffman
031eb1e5ed removed no longer used SpiderScheduler (obsoleted by ExecutionQueue) 2010-05-28 17:27:15 -03:00
Ismael Carnales
a71dc295af Some mail improvements and tests.
* Add mail_sent signal and use it in MailSender
* Add MAIL_DEBUG setting to not send mails when testing
* Add MailSender tests
2010-05-28 16:51:47 -03:00
Daniel Grana
68a875edb0 update ENCODING_ALIASES setting default value in settings documentation topic 2010-04-07 10:54:54 -03:00
Pablo Hoffman
2299deda66 updated wrong link in doc 2010-03-26 14:02:33 -03:00
Pablo Hoffman
1330697c3d Some improvements to Response encoding support:
* added encoding aliases, configurable through a new ENCODING_ALIASES setting
* Response.encoding now returns the real encoding detected for the body
* simplified TextResponse API by removing body_encoding() and
  headers_encoding() methods
* Response.encoding now tries to infer the encoding from the body always (it
  was done before only on HtmlResponse and TextResponse)
* removed scrapy.utils.encoding.add_encoding_alias() function
* updated implementation of scrapy.utils.response function to reflect these API
  changes
* updated documentation to reflect API changes
2010-03-25 15:47:10 -03:00
Pablo Hoffman
9ddcd1095d sort setting alphabetically 2010-03-25 11:45:06 -03:00
Pablo Hoffman
4fa833c849 Added LOG_ENCODING setting 2010-03-24 12:13:38 -03:00
Pablo Hoffman
d12cd22d5e switched default scheduler order to DFO, which consumes less memory by default 2010-03-04 10:15:58 -02:00
Pablo Hoffman
c1f8198639 Added RANDOMIZE_DOWNLOAD_DELAY setting 2010-02-19 21:53:18 -02:00
Pablo Hoffman
57d60eae39 sort settings doc alphabetically by setting name 2010-01-31 18:11:13 -02:00
Pablo Hoffman
08eeaf98a2 fixed description of LOG_STDOUT setting 2010-01-13 15:51:08 -02:00
Pablo Hoffman
07655d05ea renamed REQUESTS_PER_SPIDER setting to CONCURRENT_REQUESTS_PER_SPIDER 2009-11-13 14:38:22 -02:00
Pablo Hoffman
564abd10ad Refactored HttpCache middleware:
* simplified code
* performance improvements
* removed awkward/unused domain sectorization
* it can now receive Settings on constructor
* added unittests
* added documentation about filesystem storage structure

Also made scrapy.conf.Settings objects instantiable with a dict which is used to override default settings.
2009-11-13 14:25:47 -02:00
Pablo Hoffman
919cd5b789 renamed setting CONCURRENT_DOMAINS to CONCURRENT_SPIDERS 2009-11-06 15:44:11 -02:00
Pablo Hoffman
d604dca96d renamed setting REQUESTS_PER_DOMAIN to REQUESTS_PER_SPIDER 2009-11-06 15:42:11 -02:00
Pablo Hoffman
7296a7b889 added DEFAULT_RESPONSE_ENCODING setting 2009-10-21 16:13:41 -02:00
Pablo Hoffman
937acd91d1 improved documentation of http proxy middleware 2009-10-07 21:00:34 -02:00
Daniel Grana
8aa7d153ae rewrite of downloader handlers
* add REQUEST_HANDLERS setting with defaults for file, http and https schemes
* add documentation of new setting
* add unittests for all the builtin handlers
* remove unused getPage function
2009-10-05 04:10:22 -02:00
Pablo Hoffman
921fc4f3bf Big Scrapy core refactoring to pass around spider references instead of domains.
This is to avoid accessing the scrapy.spider.spiders singleton for "resolving"
spiders, which is considered an "evil" practice because it ties us to the
singleton model for the spider resolver, which is a bad thing.

This change will also work as the foundation for the API cleaning that we'll
perform for 0.8. We decided to introduce this change now to have a more common
codebase between 0.7 and 0.8, which will allow us to better support 0.7 until
0.8 is released.

However, this change doesn't modify the stable/documented API, nor does it
change the core logic. Those changes will land on the 0.8 branch, after 0.7 is
released.

--HG--
rename : scrapy/contrib/domainsch.py => scrapy/contrib/spiderscheduler.py
2009-09-12 14:34:18 -03:00
Pablo Hoffman
f1bb8dc2a3 first cleanup of spider manager api
- removed asdict() and reload() methods
- added list() method
- removed default spider
2009-09-10 19:06:46 -03:00
Pablo Hoffman
269724a2b7 added Debugger extension, removed StackTraceDump from extensions available by default 2009-09-08 22:32:17 -03:00
Pablo Hoffman
827aa19c6e removed obsolete scrapy.utils.db module 2009-09-04 17:38:14 -03:00
Pablo Hoffman
861a803cc3 removed obsolete RestrictMiddleware 2009-09-04 17:22:56 -03:00
Pablo Hoffman
596d2c4479 moved CoreStats extension to scrapy.contrib.corestats
--HG--
rename : scrapy/stats/corestats.py => scrapy/contrib/corestats.py
2009-09-01 23:00:49 -03:00
Pablo Hoffman
6a50af05d7 removed useless SpiderReloader extension 2009-09-01 22:49:15 -03:00
Pablo Hoffman
79851aefa6 moved SpiderProfiler extension to scrapy.contrib_exp and removed references from documentation
--HG--
rename : scrapy/contrib/spider/profiler.py => scrapy/contrib_exp/spiderprofiler.py
2009-09-01 22:38:37 -03:00
Pablo Hoffman
18fd635124 another doc typo 2009-09-01 12:52:40 -03:00
Pablo Hoffman
538cc9803a fixed doc typo 2009-09-01 12:47:53 -03:00
Pablo Hoffman
895c70e036 doc: fixed some links to scrapy-ctl topic 2009-08-29 18:23:55 -03:00
Pablo Hoffman
64572124e0 added doc about SCRAPY_SETTINGS_MODULE 2009-08-29 18:20:13 -03:00
Pablo Hoffman
60cbf24c89 more cleanups to startproject and project templates
--HG--
rename : scrapy/templates/project/root/scrapy-ctl.py => scrapy/templates/project/scrapy-ctl.py
2009-08-29 04:29:47 -03:00
Pablo Hoffman
8a074c9cb5 removed scrapy-admin.py command, leaving scrapy-ctl as the only scrapy command 2009-08-24 15:43:36 -03:00
Ismael Carnales
85282a4b76 added scrapy commandline scripts doc 2009-08-24 12:02:44 -03:00