* removed scrapy.utils.fetch
* each command schedules requests and starts the scrapy engine
* fetch command instantiates BaseSpider if the given url matches no spider or more than one
* parse command schedules the url if one spider matches
* parse and fetch don't support multiple urls as parameters
* the --spider option (force spider behavior) moved from BaseCommand to only the fetch, parse and crawl commands
* implemented find/create methods in the Spider Manager API, removed fromdomain and fromurl
The create method is now in charge of spider resolution: it must return a spider
object for its argument or raise KeyError if no spider is found.
It obsoletes the fromdomain and fromurl methods.
The default implementation of create only searches against spider.name; it
won't use spider.allowed_domains like the old fromdomain. This is why you
must supply a spider if you want to crawl a url.
Find methods return only the names of available spiders, not spider instances.
If no spider is found, they return an empty list.
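A minimal sketch of the create/find contract described above, not the actual Spider Manager code. ToySpider, ToyRequest and ToySpiderManager are hypothetical stand-ins, and the host-matching rule used by find_by_request is an assumption made only for illustration.

```python
from collections import namedtuple
from urllib.parse import urlparse

# Hypothetical stand-in for a request: only the url is needed here.
ToyRequest = namedtuple("ToyRequest", "url")


class ToySpider:
    """Hypothetical spider carrying only the attributes used below."""

    def __init__(self, name, allowed_domains=()):
        self.name = name
        self.allowed_domains = tuple(allowed_domains)


class ToySpiderManager:
    """Hypothetical manager mirroring the create/find contract."""

    def __init__(self, spiders):
        # Index by name only; create() never consults allowed_domains.
        self._spiders = {spider.name: spider for spider in spiders}

    def create(self, spider_id):
        # The dict lookup raises KeyError when no spider has that name.
        return self._spiders[spider_id]

    def find_by_request(self, request):
        # Returns spider names, not instances; empty list when nothing matches.
        # Assumed rule: the host equals an allowed domain or is a subdomain of it.
        host = urlparse(request.url).hostname or ""
        return [
            name
            for name, spider in self._spiders.items()
            if any(host == d or host.endswith("." + d) for d in spider.allowed_domains)
        ]


manager = ToySpiderManager([
    ToySpider("example", ["example.com"]),
    ToySpider("example_blog", ["blog.example.com"]),
])
```

In this sketch, manager.create("example.com") raises KeyError because lookup is by name, which is the situation in which a spider has to be supplied explicitly when crawling a url.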
Affected modules:
* command.models (force_domain)
  * removed spiders.force_domain
  * each command passes the spider to crawl_* commands
* command.commands.*
  * crawl
    * set spider from opts.spider if arg is a url
    * group urls by spider to instantiate each spider just once
  * genspider
    * use spiders.create() to check spider id
  * parse
    * log an error if more than one spider is found
* core.manager
  * on crawl_*, log a message if multiple spiders are found for a url or request
* shell
  * prints "Multiple found" if more than one spider is found for a url or request
  * populate_vars(): added spider keyword parameter
* contrib.spidermanager
  * removed fromdomain() & fromurl()
  * new create(spider_id) -> Spider. Raises KeyError if spider not found
  * new find_by_request(request) -> list(spiders)
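The "group urls by spider" step of the crawl command could look roughly like the sketch below. It is an illustration under stated assumptions, not the command's real code: group_urls_by_spider and find_names are hypothetical names, and find_names stands in for whatever lookup returns the spider names matching a url.

```python
from collections import defaultdict


def group_urls_by_spider(urls, find_names):
    """Map each spider name to its urls so each spider is created once.

    find_names is a hypothetical callable returning the list of spider
    names that match a url. Urls with zero or several matches are
    returned separately so the caller can log them.
    """
    grouped = defaultdict(list)
    unresolved = []
    for url in urls:
        names = find_names(url)
        if len(names) == 1:
            grouped[names[0]].append(url)
        else:
            unresolved.append(url)
    return dict(grouped), unresolved


def find_names(url):
    # Toy lookup table keyed by host, used only for this example.
    table = {"example.com": ["example"], "shared.org": ["a", "b"]}
    host = url.split("/")[2]
    return table.get(host, [])


grouped, unresolved = group_urls_by_spider(
    [
        "http://example.com/1",
        "http://example.com/2",
        "http://shared.org/x",
        "http://unknown.net/",
    ],
    find_names,
)
```

Here both example.com urls end up under the single "example" key, while the ambiguous and the unmatched url are left in unresolved.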
* added encoding aliases, configurable through a new ENCODING_ALIASES setting
* Response.encoding now returns the real encoding detected for the body
* simplified TextResponse API by removing body_encoding() and
headers_encoding() methods
* Response.encoding now always tries to infer the encoding from the body
(previously this was done only on HtmlResponse and TextResponse)
* removed scrapy.utils.encoding.add_encoding_alias() function
* updated implementation of scrapy.utils.response functions to reflect these API
changes
* updated documentation to reflect API changes
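As a rough illustration of how a settings-driven alias table can work, the sketch below maps an encoding label through a dict before asking Python's codecs module whether the result is usable. The alias entries and the resolve_encoding helper are assumptions for this example; they are not the contents of the real ENCODING_ALIASES setting nor the code that consumes it.

```python
import codecs

# Hypothetical alias table; the real setting's contents are not shown here.
ENCODING_ALIASES = {
    "x-sjis": "shift_jis",
    "macintosh": "mac_roman",
}


def resolve_encoding(label, aliases=ENCODING_ALIASES):
    """Return a codec name Python can use for label, or None if unknown."""
    key = label.strip().lower()
    # Fall back to the label itself when no alias is configured for it.
    name = aliases.get(key, key)
    try:
        codecs.lookup(name)
    except LookupError:
        return None
    return name
```

With this table, resolve_encoding("X-SJIS") yields "shift_jis", and a label that neither the table nor Python knows yields None.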