mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 19:03:53 +00:00

reduced introduction text in proposed doc

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40836
This commit is contained in:
Ismael Carnales 2009-02-09 15:06:22 +00:00
parent d98f35af94
commit 844b59b5b3


@ -12,75 +12,38 @@ Overview
:height: 468
:alt: Scrapy architecture
.. _request-response:
Requests and Responses
----------------------
Scrapy uses *Requests* and *Responses* for crawling web sites.
.. _overview-spiders:
Generally, *Requests* are generated in the Spiders and pass across the system
until they reach the *Downloader*, which executes the *Request* and returns a
*Response* which goes back to the Spider that generated the *Request*.
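This Request → Downloader → Response cycle can be sketched in plain Python. The sketch below only illustrates the flow; the class and function names are hypothetical stand-ins, not Scrapy's actual API:

```python
# Illustrative sketch of the Request/Response cycle -- not Scrapy's real API.
class Request:
    def __init__(self, url, callback):
        self.url = url
        self.callback = callback  # the function the Response is handed back to

class Response:
    def __init__(self, url, body):
        self.url = url
        self.body = body

def downloader(request):
    # A real downloader would fetch the URL over HTTP; here we fake a body.
    return Response(request.url, body="<html>page at %s</html>" % request.url)

def parse(response):
    # Spider callback: receives the Response generated for its Request.
    return "parsed " + response.url

request = Request("http://example.com", callback=parse)
response = downloader(request)       # the Downloader executes the Request
result = request.callback(response)  # the Response goes back to the Spider
```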
Spiders
-------
Spiders are user-written classes to scrape information from a domain (or group
of domains).

They define an initial set of URLs (or Requests) to download, how to crawl the
domain, and how to scrape *Items* from their pages.
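As a rough illustration, a spider of this shape could look like the following plain-Python sketch. The names here are hypothetical (a real Scrapy spider subclasses the framework's base spider class and receives real Response objects):

```python
# Hypothetical sketch of the Spider concept -- not Scrapy's actual API.
class Spider:
    start_urls = []  # initial set of URLs to download

    def parse(self, response):
        # Override to decide how to crawl further and what data to scrape.
        raise NotImplementedError

class ExampleSpider(Spider):
    domain_name = "example.com"
    start_urls = ["http://example.com/index.html"]

    def parse(self, response):
        # Return scraped data; a real spider would also emit new Requests.
        return [{"url": response["url"], "title": response["title"]}]

spider = ExampleSpider()
# A plain dict stands in for a downloaded Response in this sketch.
items = spider.parse({"url": "http://example.com/index.html",
                      "title": "Example"})
```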
Items
-----

Items are the placeholder to use for the scraped data. They are represented by a
simple Python class.
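A minimal sketch of such a placeholder class, using nothing beyond plain Python (illustrative only; Scrapy's real item classes offer more than this):

```python
# A simple placeholder class for scraped data, in the spirit of the text.
class Item:
    def __init__(self, **fields):
        # Store each scraped field as an instance attribute.
        for name, value in fields.items():
            setattr(self, name, value)

    def __repr__(self):
        return "Item(%r)" % self.__dict__

book = Item(title="Some Book", price="19.90")
```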
After an Item has been scraped by a Spider, it is sent to the Item Pipeline for
further processing.

.. _item-pipeline:

Item Pipeline
-------------
The Item Pipeline is a list of user-written Python classes that implement a
specific method, which is called sequentially for every element of the Pipeline.
Each element receives the scraped Item, performs an action upon it (like
validating, checking for duplicates, or storing the item), and then decides
whether the Item continues through the Pipeline or is dropped.
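This sequential-call-with-drop behaviour can be sketched in plain Python. The class and method names below (including `DropItem` and the `process_item` signature) are illustrative assumptions, not necessarily Scrapy's exact API:

```python
# Plain-Python sketch of the Item Pipeline concept (hypothetical names).
class DropItem(Exception):
    """Raised by a pipeline element to stop an item from continuing."""

class ValidatePrice:
    def process_item(self, item):
        if not item.get("price"):
            raise DropItem("missing price")
        return item

class NormalizeTitle:
    def process_item(self, item):
        item["title"] = item["title"].strip().title()
        return item

def run_pipeline(item, elements):
    # Each element's method is called sequentially; DropItem ends the chain.
    try:
        for element in elements:
            item = element.process_item(item)
        return item
    except DropItem:
        return None

elements = [ValidatePrice(), NormalizeTitle()]
good = run_pipeline({"title": "  some book ", "price": "19.90"}, elements)
bad = run_pipeline({"title": "No Price", "price": ""}, elements)
```

Here `good` comes out normalized, while `bad` is dropped by the validation element and never reaches the rest of the Pipeline.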