Update documentation links

commit 5876b9aa30, parent 9f4fe5dc4a
@@ -120,7 +120,7 @@ Scrapy Contrib
 ==============
 
 Scrapy contrib shares a similar rationale as Django contrib, which is explained
-in `this post <http://jacobian.org/writing/what-is-django-contrib/>`_. If you
+in `this post <https://jacobian.org/writing/what-is-django-contrib/>`_. If you
 are working on a new functionality, please follow that rationale to decide
 whether it should be a Scrapy contrib. If unsure, you can ask in
 `scrapy-users`_.
@@ -189,7 +189,7 @@ And their unit-tests are in::
 
 .. _issue tracker: https://github.com/scrapy/scrapy/issues
 .. _scrapy-users: https://groups.google.com/forum/#!forum/scrapy-users
-.. _Twisted unit-testing framework: http://twistedmatrix.com/documents/current/core/development/policy/test-standard.html
+.. _Twisted unit-testing framework: https://twistedmatrix.com/documents/current/core/development/policy/test-standard.html
 .. _AUTHORS: https://github.com/scrapy/scrapy/blob/master/AUTHORS
 .. _tests/: https://github.com/scrapy/scrapy/tree/master/tests
 .. _open issues: https://github.com/scrapy/scrapy/issues
docs/faq.rst
@@ -77,8 +77,8 @@ Scrapy crashes with: ImportError: No module named win32api
 
 You need to install `pywin32`_ because of `this Twisted bug`_.
 
-.. _pywin32: http://sourceforge.net/projects/pywin32/
-.. _this Twisted bug: http://twistedmatrix.com/trac/ticket/3707
+.. _pywin32: https://sourceforge.net/projects/pywin32/
+.. _this Twisted bug: https://twistedmatrix.com/trac/ticket/3707
 
 How can I simulate a user login in my spider?
 ---------------------------------------------
@@ -123,7 +123,7 @@ Why does Scrapy download pages in English instead of my native language?
 Try changing the default `Accept-Language`_ request header by overriding the
 :setting:`DEFAULT_REQUEST_HEADERS` setting.
 
-.. _Accept-Language: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4
+.. _Accept-Language: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4
 
 Where can I find some example Scrapy projects?
 ----------------------------------------------
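
For context on the hunk above: ``DEFAULT_REQUEST_HEADERS`` is a plain
dictionary setting, so the override the FAQ suggests is a few lines in
``settings.py``. A minimal sketch (the header value here is illustrative)::

    # settings.py -- sketch only; the language preference is hypothetical
    DEFAULT_REQUEST_HEADERS = {
        'Accept-Language': 'de',  # request German pages instead of the default 'en'
    }
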
@@ -282,7 +282,7 @@ I'm scraping a XML document and my XPath selector doesn't return any items
 
 You may need to remove namespaces. See :ref:`removing-namespaces`.
 
-.. _user agents: http://en.wikipedia.org/wiki/User_agent
-.. _LIFO: http://en.wikipedia.org/wiki/LIFO
-.. _DFO order: http://en.wikipedia.org/wiki/Depth-first_search
-.. _BFO order: http://en.wikipedia.org/wiki/Breadth-first_search
+.. _user agents: https://en.wikipedia.org/wiki/User_agent
+.. _LIFO: https://en.wikipedia.org/wiki/LIFO
+.. _DFO order: https://en.wikipedia.org/wiki/Depth-first_search
+.. _BFO order: https://en.wikipedia.org/wiki/Breadth-first_search
@@ -74,7 +74,7 @@ Windows
 Be sure you download the architecture (win32 or amd64) that matches your system
 
 * *(Only required for Python<2.7.9)* Install `pip`_ from
-  https://pip.pypa.io/en/latest/installing.html
+  https://pip.pypa.io/en/latest/installing/
 
 Now open a Command prompt to check ``pip`` is installed correctly::
 
@@ -171,9 +171,9 @@ After any of these workarounds you should be able to install Scrapy::
 
     pip install Scrapy
 
 .. _Python: https://www.python.org/
-.. _pip: https://pip.pypa.io/en/latest/installing.html
-.. _easy_install: http://pypi.python.org/pypi/setuptools
-.. _Control Panel: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/sysdm_advancd_environmnt_addchange_variable.mspx
+.. _pip: https://pip.pypa.io/en/latest/installing/
+.. _easy_install: https://pypi.python.org/pypi/setuptools
+.. _Control Panel: https://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/sysdm_advancd_environmnt_addchange_variable.mspx
 .. _lxml: http://lxml.de/
 .. _OpenSSL: https://pypi.python.org/pypi/pyOpenSSL
 .. _setuptools: https://pypi.python.org/pypi/setuptools
@@ -170,7 +170,7 @@ your code in Scrapy projects and `join the community`_. Thanks for your
 interest!
 
 .. _join the community: http://scrapy.org/community/
-.. _web scraping: http://en.wikipedia.org/wiki/Web_scraping
-.. _Amazon Associates Web Services: http://aws.amazon.com/associates/
-.. _Amazon S3: http://aws.amazon.com/s3/
+.. _web scraping: https://en.wikipedia.org/wiki/Web_scraping
+.. _Amazon Associates Web Services: https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html
+.. _Amazon S3: https://aws.amazon.com/s3/
 .. _Sitemaps: http://www.sitemaps.org
@@ -7,7 +7,7 @@ Scrapy Tutorial
 In this tutorial, we'll assume that Scrapy is already installed on your system.
 If that's not the case, see :ref:`intro-install`.
 
-We are going to use `Open directory project (dmoz) <http://www.dmoz.org/>`_ as
+We are going to use `Open directory project (dmoz) <https://www.dmoz.org/>`_ as
 our example domain to scrape.
 
 This tutorial will walk you through these tasks:
@@ -191,8 +191,8 @@ based on `XPath`_ or `CSS`_ expressions called :ref:`Scrapy Selectors
 <topics-selectors>`. For more information about selectors and other extraction
 mechanisms see the :ref:`Selectors documentation <topics-selectors>`.
 
-.. _XPath: http://www.w3.org/TR/xpath
-.. _CSS: http://www.w3.org/TR/selectors
+.. _XPath: https://www.w3.org/TR/xpath
+.. _CSS: https://www.w3.org/TR/selectors
 
 Here are some examples of XPath expressions and their meanings:
@@ -544,5 +544,5 @@ Then, we recommend you continue by playing with an example project (see
 :ref:`intro-examples`), and then continue with the section
 :ref:`section-basics`.
 
-.. _JSON: http://en.wikipedia.org/wiki/JSON
+.. _JSON: https://en.wikipedia.org/wiki/JSON
 .. _dirbot: https://github.com/scrapy/dirbot
@@ -403,10 +403,11 @@ Outsourced packages
 |                                     | :ref:`topics-deploy`)               |
 +-------------------------------------+-------------------------------------+
 | scrapy.contrib.djangoitem           | `scrapy-djangoitem <https://github. |
-|                                     | com/scrapy/scrapy-djangoitem>`_     |
+|                                     | com/scrapy-plugins/scrapy-djangoite |
+|                                     | m>`_                                |
 +-------------------------------------+-------------------------------------+
 | scrapy.webservice                   | `scrapy-jsonrpc <https://github.com |
-|                                     | /scrapy/scrapy-jsonrpc>`_           |
+|                                     | /scrapy-plugins/scrapy-jsonrpc>`_   |
 +-------------------------------------+-------------------------------------+
 
 `scrapy.contrib_exp` and `scrapy.contrib` dissolutions
@@ -1186,7 +1187,7 @@ Scrapy changes:
 - nested items now fully supported in JSON and JSONLines exporters
 - added :reqmeta:`cookiejar` Request meta key to support multiple cookie sessions per spider
 - decoupled encoding detection code to `w3lib.encoding`_, and ported Scrapy code to use that module
-- dropped support for Python 2.5. See http://blog.scrapinghub.com/2012/02/27/scrapy-0-15-dropping-support-for-python-2-5/
+- dropped support for Python 2.5. See https://blog.scrapinghub.com/2012/02/27/scrapy-0-15-dropping-support-for-python-2-5/
 - dropped support for Twisted 2.5
 - added :setting:`REFERER_ENABLED` setting, to control referer middleware
 - changed default user agent to: ``Scrapy/VERSION (+http://scrapy.org)``
@@ -1535,7 +1536,7 @@ First release of Scrapy.
 
 
 .. _AJAX crawleable urls: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started?csw=1
-.. _chunked transfer encoding: http://en.wikipedia.org/wiki/Chunked_transfer_encoding
+.. _chunked transfer encoding: https://en.wikipedia.org/wiki/Chunked_transfer_encoding
 .. _w3lib: https://github.com/scrapy/w3lib
 .. _scrapely: https://github.com/scrapy/scrapely
 .. _marshal: https://docs.python.org/2/library/marshal.html
@@ -271,4 +271,4 @@ class (which they all inherit from).
         Close the given spider. After this is called, no more specific stats
         can be accessed or collected.
 
-.. _reactor: http://twistedmatrix.com/documents/current/core/howto/reactor-basics.html
+.. _reactor: https://twistedmatrix.com/documents/current/core/howto/reactor-basics.html
@@ -125,8 +125,8 @@ links:
 * `Twisted - hello, asynchronous programming`_
 * `Twisted Introduction - Krondo`_
 
-.. _Twisted: http://twistedmatrix.com/trac/
-.. _Introduction to Deferreds in Twisted: http://twistedmatrix.com/documents/current/core/howto/defer-intro.html
+.. _Twisted: https://twistedmatrix.com/trac/
+.. _Introduction to Deferreds in Twisted: https://twistedmatrix.com/documents/current/core/howto/defer-intro.html
 .. _Twisted - hello, asynchronous programming: http://jessenoller.com/2009/02/11/twisted-hello-asynchronous-programming/
-.. _Twisted Introduction - Krondo: http://krondo.com/blog/?page_id=1327/
+.. _Twisted Introduction - Krondo: http://krondo.com/an-introduction-to-asynchronous-programming-and-twisted/
 
@@ -10,4 +10,4 @@ DjangoItem has been moved into a separate project.
 
 It is hosted at:
 
-https://github.com/scrapy/scrapy-djangoitem
+https://github.com/scrapy-plugins/scrapy-djangoitem
@@ -300,7 +300,7 @@ HttpAuthMiddleware
 
         # .. rest of the spider code omitted ...
 
-.. _Basic access authentication: http://en.wikipedia.org/wiki/Basic_access_authentication
+.. _Basic access authentication: https://en.wikipedia.org/wiki/Basic_access_authentication
 
 
 HttpCacheMiddleware
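
For context: the ``# .. rest of the spider code omitted ...`` line above is
the tail of the docs' HTTP-auth example. Its general shape is a spider
carrying credentials as class attributes, roughly as follows (a sketch; the
spider name and credentials are hypothetical)::

    from scrapy.spiders import CrawlSpider

    class SomeIntranetSiteSpider(CrawlSpider):
        # HttpAuthMiddleware picks these up and adds Basic auth to requests
        http_user = 'someuser'
        http_pass = 'somepass'
        name = 'intranet.example.com'

        # .. rest of the spider code omitted ...
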
@@ -390,9 +390,9 @@ what is implemented:
 
 what is missing:
 
-* `Pragma: no-cache` support http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1
-* `Vary` header support http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.6
-* Invalidation after updates or deletes http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.10
+* `Pragma: no-cache` support https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1
+* `Vary` header support https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.6
+* Invalidation after updates or deletes https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.10
 * ... probably others ..
 
 In order to use this policy, set:
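
The hunk cuts off just before the setting name; judging by the storage-backend
instructions on the same page (which reference ``scrapy.extensions.httpcache``),
the policy is presumably selected via :setting:`HTTPCACHE_POLICY`, a one-line
sketch in ``settings.py``::

    # opt in to the RFC2616 cache policy (sketch; path assumed from context)
    HTTPCACHE_POLICY = 'scrapy.extensions.httpcache.RFC2616Policy'
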
@@ -464,7 +464,7 @@ In order to use this storage backend:
 * set :setting:`HTTPCACHE_STORAGE` to ``scrapy.extensions.httpcache.LeveldbCacheStorage``
 * install `LevelDB python bindings`_ like ``pip install leveldb``
 
-.. _LevelDB: http://code.google.com/p/leveldb/
+.. _LevelDB: https://github.com/google/leveldb
 .. _leveldb python bindings: https://pypi.python.org/pypi/leveldb
 
@@ -964,6 +964,6 @@ Default: ``"latin-1"``
 The default encoding for proxy authentication on :class:`HttpProxyMiddleware`.
 
 
-.. _DBM: http://en.wikipedia.org/wiki/Dbm
+.. _DBM: https://en.wikipedia.org/wiki/Dbm
 .. _anydbm: https://docs.python.org/2/library/anydbm.html
-.. _chunked transfer encoding: http://en.wikipedia.org/wiki/Chunked_transfer_encoding
+.. _chunked transfer encoding: https://en.wikipedia.org/wiki/Chunked_transfer_encoding
@@ -15,7 +15,7 @@ simple API for sending attachments and it's very easy to configure, with a few
 :ref:`settings <topics-email-settings>`.
 
 .. _smtplib: https://docs.python.org/2/library/smtplib.html
-.. _Twisted non-blocking IO: http://twistedmatrix.com/documents/current/core/howto/defer-intro.html
+.. _Twisted non-blocking IO: https://twistedmatrix.com/documents/current/core/howto/defer-intro.html
 
 Quick example
 =============
@@ -21,7 +21,7 @@ avoid collision with existing (and future) extensions. For example, a
 hypothetic extension to handle `Google Sitemaps`_ would use settings like
 `GOOGLESITEMAP_ENABLED`, `GOOGLESITEMAP_DEPTH`, and so on.
 
-.. _Google Sitemaps: http://en.wikipedia.org/wiki/Sitemaps
+.. _Google Sitemaps: https://en.wikipedia.org/wiki/Sitemaps
 
 Loading & activating extensions
 ===============================
@@ -355,8 +355,8 @@ There are at least two ways to send Scrapy the `SIGQUIT`_ signal:
 
     kill -QUIT <pid>
 
-.. _SIGUSR2: http://en.wikipedia.org/wiki/SIGUSR1_and_SIGUSR2
-.. _SIGQUIT: http://en.wikipedia.org/wiki/SIGQUIT
+.. _SIGUSR2: https://en.wikipedia.org/wiki/SIGUSR1_and_SIGUSR2
+.. _SIGQUIT: https://en.wikipedia.org/wiki/SIGQUIT
 
 Debugger extension
 ~~~~~~~~~~~~~~~~~~
@@ -330,7 +330,7 @@ format in :setting:`FEED_EXPORTERS`. E.g., to disable the built-in CSV exporter
         'csv': None,
     }
 
-.. _URI: http://en.wikipedia.org/wiki/Uniform_Resource_Identifier
-.. _Amazon S3: http://aws.amazon.com/s3/
+.. _URI: https://en.wikipedia.org/wiki/Uniform_Resource_Identifier
+.. _Amazon S3: https://aws.amazon.com/s3/
 .. _boto: https://github.com/boto/boto
 .. _botocore: https://github.com/boto/botocore
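
Pieced together, the override this hunk tails off is short; a sketch of the
whole thing in ``settings.py``::

    FEED_EXPORTERS = {
        'csv': None,  # disable the built-in CSV exporter
    }
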
@@ -164,4 +164,4 @@ elements.
 or tags which don't appear in the page HTML
 sources, since Firebug inspects the live DOM
 
-.. _has been shut down by Google: http://searchenginewatch.com/sew/news/2096661/google-directory-shut
+.. _has been shut down by Google: https://searchenginewatch.com/sew/news/2096661/google-directory-shut
@@ -160,8 +160,8 @@ method and how to clean up the resources properly.
 
            self.db[self.collection_name].insert(dict(item))
            return item
 
-.. _MongoDB: http://www.mongodb.org/
-.. _pymongo: http://api.mongodb.org/python/current/
+.. _MongoDB: https://www.mongodb.org/
+.. _pymongo: https://api.mongodb.org/python/current/
 
 Duplicates filter
 -----------------
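
For context, the two indented lines above are the tail of the docs' MongoDB
pipeline example. Its general shape, sketched here (setting names such as
``MONGO_URI`` belong to the example, not to Scrapy itself)::

    import pymongo

    class MongoPipeline(object):

        collection_name = 'scrapy_items'

        def __init__(self, mongo_uri, mongo_db):
            self.mongo_uri = mongo_uri
            self.mongo_db = mongo_db

        @classmethod
        def from_crawler(cls, crawler):
            # read connection parameters from the project settings
            return cls(
                mongo_uri=crawler.settings.get('MONGO_URI'),
                mongo_db=crawler.settings.get('MONGO_DATABASE', 'items'),
            )

        def open_spider(self, spider):
            self.client = pymongo.MongoClient(self.mongo_uri)
            self.db = self.client[self.mongo_db]

        def close_spider(self, spider):
            # the resource clean-up the surrounding text refers to
            self.client.close()

        def process_item(self, item, spider):
            self.db[self.collection_name].insert(dict(item))
            return item
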
@@ -143,7 +143,7 @@ Supported Storage
 File system is currently the only officially supported storage, but there is
 also (undocumented) support for storing files in `Amazon S3`_.
 
-.. _Amazon S3: http://aws.amazon.com/s3/
+.. _Amazon S3: https://aws.amazon.com/s3/
 
 File system storage
 -------------------
@@ -223,7 +223,7 @@ Where:
 
 * ``<image_id>`` is the `SHA1 hash`_ of the image url
 
-.. _SHA1 hash: http://en.wikipedia.org/wiki/SHA_hash_functions
+.. _SHA1 hash: https://en.wikipedia.org/wiki/SHA_hash_functions
 
 Example of image files stored using ``small`` and ``big`` thumbnail names::
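
As an aside to this hunk: the ``<image_id>`` above is easy to reproduce with
the standard library (a sketch; Scrapy's own implementation may differ in
detail)::

    import hashlib

    url = 'http://www.example.com/image.jpg'  # hypothetical image url
    image_id = hashlib.sha1(url.encode('ascii')).hexdigest()
    # -> a 40-character hex digest, used as the stored file's name
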
@@ -390,5 +390,5 @@ above::
         item['image_paths'] = image_paths
         return item
 
-.. _Twisted Failure: http://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html
-.. _MD5 hash: http://en.wikipedia.org/wiki/MD5
+.. _Twisted Failure: https://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html
+.. _MD5 hash: https://en.wikipedia.org/wiki/MD5
@@ -251,5 +251,5 @@ If you are still unable to prevent your bot getting banned, consider contacting
 .. _ProxyMesh: http://proxymesh.com/
 .. _Google cache: http://www.googleguide.com/cached_pages.html
 .. _testspiders: https://github.com/scrapinghub/testspiders
-.. _Twisted Reactor Overview: http://twistedmatrix.com/documents/current/core/howto/reactor-basics.html
+.. _Twisted Reactor Overview: https://twistedmatrix.com/documents/current/core/howto/reactor-basics.html
 .. _Crawlera: http://scrapinghub.com/crawlera
@@ -621,4 +621,4 @@ XmlResponse objects
     adds encoding auto-discovering support by looking into the XML declaration
     line. See :attr:`TextResponse.encoding`.
 
-.. _Twisted Failure: http://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html
+.. _Twisted Failure: https://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html
@@ -40,8 +40,8 @@ For a complete reference of the selectors API see
 .. _lxml: http://lxml.de/
 .. _ElementTree: https://docs.python.org/2/library/xml.etree.elementtree.html
 .. _cssselect: https://pypi.python.org/pypi/cssselect/
-.. _XPath: http://www.w3.org/TR/xpath
-.. _CSS: http://www.w3.org/TR/selectors
+.. _XPath: https://www.w3.org/TR/xpath
+.. _CSS: https://www.w3.org/TR/selectors
 
 
 Using selectors
@@ -281,7 +281,7 @@ Another common case would be to extract all direct ``<p>`` children::
 For more details about relative XPaths see the `Location Paths`_ section in the
 XPath specification.
 
-.. _Location Paths: http://www.w3.org/TR/xpath#location-paths
+.. _Location Paths: https://www.w3.org/TR/xpath#location-paths
 
 Using EXSLT extensions
 ----------------------
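
The example elided from this hunk boils down to the leading ``.`` making an
XPath relative to the current selector; a sketch (the markup is hypothetical)::

    >>> divs = response.xpath('//div')
    >>> for p in divs.xpath('.//p'):  # relative: any <p> inside each <div>
    ...     print(p.extract())
    >>> for p in divs.xpath('./p'):   # relative: direct <p> children only
    ...     print(p.extract())
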
@@ -439,7 +439,7 @@ you may want to take a look first at this `XPath tutorial`_.
 
 
 .. _`XPath tutorial`: http://www.zvon.org/comp/r/tut-XPath_1.html
-.. _`this post from ScrapingHub's blog`: http://blog.scrapinghub.com/2014/07/17/xpath-tips-from-the-web-scraping-trenches/
+.. _`this post from ScrapingHub's blog`: https://blog.scrapinghub.com/2014/07/17/xpath-tips-from-the-web-scraping-trenches/
 
 
 Using text nodes in a condition
@@ -481,7 +481,7 @@ But using the ``.`` to mean the node, works::
     >>> sel.xpath("//a[contains(., 'Next Page')]").extract()
     [u'<a href="#">Click here to go to the <strong>Next Page</strong></a>']
 
-.. _`XPath string function`: http://www.w3.org/TR/xpath/#section-String-Functions
+.. _`XPath string function`: https://www.w3.org/TR/xpath/#section-String-Functions
 
 Beware of the difference between //node[1] and (//node)[1]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
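
In short, the distinction that section goes on to explain: ``//node[1]``
selects the first ``<node>`` under each of its parents, while ``(//node)[1]``
selects the first ``<node>`` in the whole document. A sketch with hypothetical
markup::

    >>> from scrapy.selector import Selector
    >>> sel = Selector(text="""
    ... <ul><li>a</li><li>b</li></ul>
    ... <ul><li>c</li></ul>
    ... """)
    >>> sel.xpath('(//li)[1]/text()').extract()  # first <li> in the document
    [u'a']
    >>> sel.xpath('//li[1]/text()').extract()    # first <li> under each <ul>
    [u'a', u'c']
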
@@ -1202,6 +1202,6 @@ case to see how to enable and use them.
 .. settingslist::
 
 
-.. _Amazon web services: http://aws.amazon.com/
-.. _breadth-first order: http://en.wikipedia.org/wiki/Breadth-first_search
-.. _depth-first order: http://en.wikipedia.org/wiki/Depth-first_search
+.. _Amazon web services: https://aws.amazon.com/
+.. _breadth-first order: https://en.wikipedia.org/wiki/Breadth-first_search
+.. _depth-first order: https://en.wikipedia.org/wiki/Depth-first_search
@@ -138,7 +138,7 @@ Example of shell session
 ========================
 
 Here's an example of a typical shell session where we start by scraping the
-http://scrapy.org page, and then proceed to scrape the http://reddit.com
+http://scrapy.org page, and then proceed to scrape the https://reddit.com
 page. Finally, we modify the (Reddit) request method to POST and re-fetch it
 getting an error. We end the session by typing Ctrl-D (in Unix systems) or
 Ctrl-Z in Windows.
@@ -22,7 +22,7 @@ Deferred signal handlers
 Some signals support returning `Twisted deferreds`_ from their handlers, see
 the :ref:`topics-signals-ref` below to know which ones.
 
-.. _Twisted deferreds: http://twistedmatrix.com/documents/current/core/howto/defer.html
+.. _Twisted deferreds: https://twistedmatrix.com/documents/current/core/howto/defer.html
 
 .. _topics-signals-ref:
 
@@ -258,4 +258,4 @@ response_downloaded
    :param spider: the spider for which the response is intended
    :type spider: :class:`~scrapy.spiders.Spider` object
 
-.. _Failure: http://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html
+.. _Failure: https://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html
@@ -211,7 +211,7 @@ HttpErrorMiddleware
 According to the `HTTP standard`_, successful responses are those whose
 status codes are in the 200-300 range.
 
-.. _HTTP standard: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
+.. _HTTP standard: https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
 
 If you still want to process response codes outside that range, you can
 specify which response codes the spider is able to handle using the
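
The sentence above runs past the hunk boundary; the attribute it is about to
name is presumably ``handle_httpstatus_list``. A sketch of a spider opting in
to 404 responses (spider details hypothetical)::

    import scrapy

    class MySpider(scrapy.Spider):
        name = 'myspider'
        start_urls = ['http://www.example.com/missing-page']

        # let 404 responses through HttpErrorMiddleware to the callback
        handle_httpstatus_list = [404]

        def parse(self, response):
            if response.status == 404:
                self.logger.info('Got a 404 for %s', response.url)
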
@@ -238,7 +238,7 @@ responses, unless you really know what you're doing.
 
 For more information see: `HTTP Status Code Definitions`_.
 
-.. _HTTP Status Code Definitions: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
+.. _HTTP Status Code Definitions: https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
 
 HttpErrorMiddleware settings
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -735,5 +735,5 @@ Combine SitemapSpider with other sources of urls::
 .. _Sitemaps: http://www.sitemaps.org
 .. _Sitemap index files: http://www.sitemaps.org/protocol.html#index
 .. _robots.txt: http://www.robotstxt.org/
-.. _TLD: http://en.wikipedia.org/wiki/Top-level_domain
+.. _TLD: https://en.wikipedia.org/wiki/Top-level_domain
 .. _Scrapyd documentation: http://scrapyd.readthedocs.org/en/latest/
@@ -8,4 +8,4 @@ webservice has been moved into a separate project.
 
 It is hosted at:
 
-https://github.com/scrapy/scrapy-jsonrpc
+https://github.com/scrapy-plugins/scrapy-jsonrpc
@@ -36,5 +36,5 @@ new methods or functionality but the existing methods should keep working the
 same way.
 
 
-.. _odd-numbered versions for development releases: http://en.wikipedia.org/wiki/Software_versioning#Odd-numbered_versions_for_development_releases
+.. _odd-numbered versions for development releases: https://en.wikipedia.org/wiki/Software_versioning#Odd-numbered_versions_for_development_releases
|
Loading…
x
Reference in New Issue
Block a user