Ismael Carnales
e694c8ed02
Remove domain references in close spider extension doc
2009-11-30 11:38:56 -02:00
Ismael Carnales
12a7ff7312
Rename Close domain to close spider in extensions doc
2009-11-30 11:36:18 -02:00
Ismael Carnales
8d9cedd88b
Reorder signals doc to respect alphabetical order
2009-11-30 11:29:19 -02:00
Ismael Carnales
93cc3d2715
Correct param formatting in item pipelines doc
2009-11-30 11:04:15 -02:00
Pablo Hoffman
6084be3b2e
added iter_all() function to scrapy.util.trackref module and improved memory leaks documentation. also added a new FAQ entry about memory issues
2009-11-28 16:21:59 -02:00
Pablo Hoffman
f4e93700bd
Automated merge with http://hg.scrapy.org/scrapy-stable/
2009-11-19 10:44:02 -02:00
Pablo Hoffman
c4f77c4da0
minor fixes to images doc (thanks amccloud)
2009-11-16 11:15:25 -02:00
Pablo Hoffman
0d6aee1f12
updated wrong documentation
2009-11-13 20:03:56 -02:00
Pablo Hoffman
aeab5370cb
StatsCollector: ported methods to receive spider instances (closes #113), removed list_domains() method, added iter_spider_stats() method
2009-11-14 20:28:59 -02:00
Pablo Hoffman
c4c6e7c8cd
Automated merge with http://hg.scrapy.org/scrapy-stable/
2009-11-13 20:04:39 -02:00
Pablo Hoffman
07655d05ea
renamed REQUESTS_PER_SPIDER setting to CONCURRENT_REQUESTS_PER_SPIDER
2009-11-13 14:38:22 -02:00
Pablo Hoffman
564abd10ad
Refactored HttpCache middleware:
* simplified code
* performance improvements
* removed awkward/unused domain sectorization
* it can now receive Settings on constructor
* added unittests
* added documentation about filesystem storage structure
Also made scrapy.conf.Settings objects instantiable with a dict which is used to override default settings.
2009-11-13 14:25:47 -02:00
Pablo Hoffman
415dec4e16
made offsite middleware log messages when filtering out requests
2009-11-12 10:17:21 -02:00
Pablo Hoffman
74d0e82dbe
renamed CloseDomain extension to CloseSpider, and renamed CLOSEDOMAIN_* settings to CLOSESPIDER_*
--HG--
rename : scrapy/contrib/closedomain.py => scrapy/contrib/closespider.py
2009-11-06 15:54:17 -02:00
Pablo Hoffman
919cd5b789
renamed setting CONCURRENT_DOMAINS to CONCURRENT_SPIDERS
2009-11-06 15:44:11 -02:00
Pablo Hoffman
d604dca96d
renamed setting REQUESTS_PER_DOMAIN to REQUESTS_PER_SPIDER
2009-11-06 15:42:11 -02:00
Pablo Hoffman
7728a23e99
Changed item pipeline API to pass spider references (instead of domain names) to process_item() method
2009-11-06 13:46:36 -02:00
Pablo Hoffman
a432c1ee40
updated logging doc to include new spider argument in log functions
2009-11-04 14:49:24 -02:00
Pablo Hoffman
97c322707a
* Renamed domain_{opened,closed,idle} signals to spider_{opened,closed,idle}
* Changed them to pass spider instances only (no domains) (refs #105)
2009-11-03 00:39:02 -02:00
Pablo Hoffman
904cde6513
added clarification about new dont_click argument of FormRequest.from_response() method
2009-10-29 13:47:10 -02:00
Ismael Carnales
a244d23b89
added dont_click attr to FormRequest
2009-10-29 13:18:13 -02:00
Pablo Hoffman
7296a7b889
added DEFAULT_RESPONSE_ENCODING setting
2009-10-21 16:13:41 -02:00
Pablo Hoffman
720bc166cf
updated new clickdata argument doc
2009-10-20 17:21:56 -02:00
Daniel Grana
6abb3c17ee
Improve FormRequest.from_response method to pass click data arguments to ClientForm library
2009-10-20 15:51:41 -02:00
Pablo Hoffman
2712d55cb9
Automated merge with http://hg.scrapy.org/scrapy-stable
2009-10-07 23:58:38 -02:00
Pablo Hoffman
bd481751d8
moved images pipeline documentation to stable doc
--HG--
rename : docs/experimental/images.rst => docs/topics/images.rst
2009-10-07 22:57:25 -02:00
Pablo Hoffman
b4d202a6b0
added note about memory usage extension not working on windows
2009-10-07 22:57:10 -02:00
Pablo Hoffman
937acd91d1
improved documentation of http proxy middleware
2009-10-07 21:00:34 -02:00
Daniel Grana
bc64ca3e13
Add support to set http proxies per request, and obey environment variables http_proxy and no_proxy by default.
2009-10-05 04:10:22 -02:00
Daniel Grana
8aa7d153ae
rewrite of downloader handlers
* add REQUEST_HANDLERS setting with defaults for file, http and https schemes
* add documentation of new setting
* add unittests for all the builtin handlers
* remove unused getPage function
2009-10-05 04:10:22 -02:00
Ismael Carnales
5862ba7db7
modified doc to reflect the new spider callback return policy (lists not needed)
2009-09-22 11:25:40 -03:00
Pablo Hoffman
132557dd14
some deployment changes in preparation for the 0.7.0 release candidate
2009-09-16 22:40:36 -03:00
Ismael Carnales
fd41f06056
added doc on how to enable an Item Pipeline component
2009-09-16 14:19:16 -03:00
Ismael Carnales
404e7e09d7
changed spider doc references in BaseSpider class
2009-09-16 14:10:11 -03:00
Daniel Grana
062730cbd8
fix csv exporter documentation
2009-09-16 00:17:50 -03:00
Pablo Hoffman
56b292e057
XmlItemExporter: added built-in support for exporting multi-valued fields (for convenience)
2009-09-14 22:05:52 -03:00
Pablo Hoffman
921fc4f3bf
Big Scrapy core refactoring to pass around spider references instead of domains.
This is to avoid accessing the scrapy.spider.spiders singleton for "resolving"
spiders, which is considered an "evil" practice because it ties us to the
singleton model for the spider resolver, which is a bad thing.
This change will also work as the foundation for the API cleaning that we'll
perform for 0.8. We decided to introduce this change now to have a more common
codebase between 0.7 and 0.8, which will allow us to better support 0.7 until
0.8 is released.
However, this change doesn't modify the stable/documented API, nor does it
change the core logic. Those changes will land on the 0.8 branch, after 0.7 is
released.
--HG--
rename : scrapy/contrib/domainsch.py => scrapy/contrib/spiderscheduler.py
2009-09-12 14:34:18 -03:00
Ismael Carnales
3998a0cb58
added more scheduler middleware documentation, and moved it to experimental
--HG--
rename : docs/topics/scheduler-middleware.rst => docs/experimental/scheduler-middleware.rst
2009-09-11 11:58:53 -03:00
Pablo Hoffman
f1bb8dc2a3
first cleanup of spider manager api
- removed asdict() and reload() methods
- added list() method
- removed default spider
2009-09-10 19:06:46 -03:00
Pablo Hoffman
269724a2b7
added Debugger extension, removed StackTraceDump from extensions available by default
2009-09-08 22:32:17 -03:00
Ismael Carnales
4ddfa9a2a3
styled downloader middleware doc
2009-09-07 12:18:57 -03:00
Ismael Carnales
e3df11e5bb
added module directive to spidermw documentation
2009-09-07 12:03:24 -03:00
Pablo Hoffman
827aa19c6e
removed obsolete scrapy.utils.db module
2009-09-04 17:38:14 -03:00
Pablo Hoffman
861a803cc3
removed obsolete RestrictMiddleware
2009-09-04 17:22:56 -03:00
Ismael Carnales
7e2587169b
added missing middleware docs
2009-09-04 12:39:02 -03:00
Pablo Hoffman
aefb94063a
more updates to spider middleware doc
2009-09-04 13:46:04 -03:00
Pablo Hoffman
d04640be5c
some improvements to spider middleware doc
2009-09-04 13:29:16 -03:00
Pablo Hoffman
96bb223c13
removed (pretty useless) DebugMiddleware
2009-09-04 12:59:58 -03:00
Daniel Grana
0e7b2a6da5
write header line by default when using csv exporter
--HG--
extra : rebase_source : 2d2d7153dde5e3f77e682e16d2e4408f732f234e
2009-09-03 13:58:39 -03:00
Pablo Hoffman
596d2c4479
moved CoreStats extension to scrapy.contrib.corestats
--HG--
rename : scrapy/stats/corestats.py => scrapy/contrib/corestats.py
2009-09-01 23:00:49 -03:00