Mirror of https://github.com/scrapy/scrapy.git, synced 2025-02-23 14:24:19 +00:00

Doc: remove trailing spaces

This commit is contained in:
parent 1be2447af9
commit a958e54954
@@ -24,8 +24,8 @@ For spiders, the scraping cycle goes through something like this:
    Requests.

 2. In the callback function, you parse the response (web page) and return either
-   dicts with extracted data, :class:`~scrapy.item.Item` objects,
-   :class:`~scrapy.http.Request` objects, or an iterable of these objects.
+   dicts with extracted data, :class:`~scrapy.item.Item` objects,
+   :class:`~scrapy.http.Request` objects, or an iterable of these objects.
    Those Requests will also contain a callback (maybe
    the same) and will then be downloaded by Scrapy and then their
    response handled by the specified callback.
@@ -56,7 +56,7 @@ scrapy.Spider
    must inherit (including spiders that come bundled with Scrapy, as well as spiders
    that you write yourself). It doesn't provide any special functionality. It just
    provides a default :meth:`start_requests` implementation which sends requests from
-   the :attr:`start_urls` spider attribute and calls the spider's method ``parse``
+   the :attr:`start_urls` spider attribute and calls the spider's method ``parse``
    for each of the resulting responses.

    .. attribute:: name
@@ -161,7 +161,7 @@ scrapy.Spider

        class MySpider(scrapy.Spider):
            name = 'myspider'
-
+
            def start_requests(self):
                return [scrapy.FormRequest("http://www.example.com/login",
                                           formdata={'user': 'john', 'pass': 'secret'},
@@ -247,8 +247,8 @@ Return multiple Requests and items from a single callback::

            for url in response.xpath('//a/@href').extract():
                yield scrapy.Request(url, callback=self.parse)

-Instead of :attr:`~.start_urls` you can use :meth:`~.start_requests` directly;
+Instead of :attr:`~.start_urls` you can use :meth:`~.start_requests` directly;
 to give data more structure you can use :ref:`topics-items`::

     import scrapy
@@ -257,7 +257,7 @@ to give data more structure you can use :ref:`topics-items`::
     class MySpider(scrapy.Spider):
         name = 'example.com'
         allowed_domains = ['example.com']
-
+
         def start_requests(self):
             yield scrapy.Request('http://www.example.com/1.html', self.parse)
             yield scrapy.Request('http://www.example.com/2.html', self.parse)
@@ -269,7 +269,7 @@ to give data more structure you can use :ref:`topics-items`::

             for url in response.xpath('//a/@href').extract():
                 yield scrapy.Request(url, callback=self.parse)
-
+
 .. _spiderargs:

 Spider arguments
@@ -301,7 +301,7 @@ Spider arguments can also be passed through the Scrapyd ``schedule.json`` API.
 See `Scrapyd documentation`_.

 .. _builtin-spiders:
-
+
 Generic Spiders
 ===============