mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 19:44:33 +00:00
Rolando Espinoza La fuente dd477914db spidermanager refactoring
* Implement find/create methods in the Spider Manager API; remove fromdomain and fromurl

    The create method is now in charge of spider resolution: it must return a
    spider object for its argument, or raise KeyError if no spider is found.

    It obsoletes the fromdomain and fromurl methods.

    The default resolution only matches against spider.name; it does not use
    spider.allowed_domains as the old fromdomain did. This is why you must
    supply a spider if you want to crawl a URL.

    The find methods return only the names of available spiders, not spider
    instances. If no spider is found, they return an empty list.
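
    The contract described above can be sketched roughly as follows. This is
    not the code from this commit: SketchSpider, SketchSpiderManager and the
    Request tuple are hypothetical stand-ins, and the host-against-name
    matching rule is an assumption made only for illustration.

```python
from collections import namedtuple
from urllib.parse import urlparse

# Hypothetical stand-in for a request object; only .url is used.
Request = namedtuple("Request", ["url"])


class SketchSpider:
    """Hypothetical spider; only the name attribute matters here."""

    def __init__(self, name):
        self.name = name


class SketchSpiderManager:
    """Hypothetical manager illustrating the create/find contract."""

    def __init__(self, names):
        self._names = set(names)

    def create(self, spider_id):
        # Resolve by spider name only; raise KeyError when nothing matches.
        if spider_id not in self._names:
            raise KeyError(spider_id)
        return SketchSpider(spider_id)

    def find_by_request(self, request):
        # Assumption: a request matches a spider when its host equals the
        # spider name or is a subdomain of it.
        # Returns spider names, never instances; empty list if none match.
        host = urlparse(request.url).hostname or ""
        return sorted(
            n for n in self._names if host == n or host.endswith("." + n)
        )


manager = SketchSpiderManager(["example.com", "example.org"])
spider = manager.create("example.com")
matches = manager.find_by_request(Request("http://www.example.com/page"))
no_matches = manager.find_by_request(Request("http://unknown.test/"))
```

    In this sketch, callers get names back from find_by_request and decide
    for themselves what to do when zero or several names match, then call
    create with the one they chose.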

Affected modules:
    * command.models (force_domain)
        * removed spiders.force_domain
    * each command passes the spider to crawl_* commands
    * command.commands.*
        * crawl
            * set spider from opts.spider if arg is url
            * group urls by spider to instantiate each spider just once
        * genspider
            * use spiders.create() to check spider id
        * parse
            * log error if more than one spider found
    * core.manager
        * on crawl_* log message if multiple spiders found for url or request
    * shell
        * prints "Multiple found" if more than one spider found for url or request
        * populate_vars(): added spider keyword parameter

    * contrib.spidermanager:
        * removed fromdomain() & fromurl()
        * new create(spider_id) -> Spider. Raises KeyError if spider not found
        * new find_by_request(request) -> list(spiders)
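
The "group urls by spider" step listed for the crawl command might look something like the sketch below. The function name group_urls_by_spider, the find_names callable, and the default_spider parameter (standing in for opts.spider) are all hypothetical; the real command code may differ.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_urls_by_spider(urls, find_names, default_spider=None):
    """Group URLs under one spider name so each spider is built once.

    find_names(url) is assumed to return a list of spider names.
    default_spider stands in for an explicitly supplied spider.
    """
    grouped = defaultdict(list)
    unresolved = []
    for url in urls:
        if default_spider is not None:
            # An explicitly supplied spider wins over any lookup.
            grouped[default_spider].append(url)
            continue
        names = find_names(url)
        if len(names) == 1:
            grouped[names[0]].append(url)
        else:
            # Zero or multiple candidates: leave the URL for the caller
            # to report instead of guessing.
            unresolved.append(url)
    return dict(grouped), unresolved


def find_names(url):
    # Hypothetical lookup: match the URL host against known spider names.
    known = ["example.com", "example.org"]
    host = urlparse(url).hostname or ""
    return [n for n in known if host == n or host.endswith("." + n)]


urls = [
    "http://www.example.com/a",
    "http://example.org/b",
    "http://www.example.com/c",
    "http://unknown.test/",
]
grouped, unresolved = group_urls_by_spider(urls, find_names)
forced, forced_unresolved = group_urls_by_spider(
    urls, find_names, default_spider="example.org"
)
```

Each key of grouped would then be passed to create() once, with its list of URLs handed to that single spider instance.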
2010-04-01 17:16:38 -03:00

This is Scrapy, an open source screen scraping framework written in Python.

For more information, visit the project home page at http://scrapy.org
Description
Scrapy, a fast high-level web crawling & scraping framework for Python.
Readme BSD-3-Clause 127 MiB
Languages
Python 99.8%
HTML 0.1%