Mirror of https://github.com/scrapy/scrapy.git (synced 2025-02-23 16:03:56 +00:00)
Merge pull request #687 from Curita/docs-linkcheck
Docs build and linkcheck in tox env
This commit is contained in:
commit 21bff7b3b3
@@ -6,10 +6,12 @@ env:
   - TOXENV=trunk
   - TOXENV=pypy
   - TOXENV=py33
+  - TOXENV=docs
 matrix:
   allow_failures:
     - env: TOXENV=pypy
     - env: TOXENV=py33
+    - env: TOXENV=docs
 install:
   - ./.travis-workarounds.sh
   - pip install tox
docs/conf.py (11 changed lines)
@@ -192,3 +192,14 @@ latex_documents = [
 
 # If false, no module index is generated.
 #latex_use_modindex = True
+
+
+# Options for the linkcheck builder
+# ---------------------------------
+
+# A list of regular expressions that match URIs that should not be checked when
+# doing a linkcheck build.
+linkcheck_ignore = [
+    'http://localhost:\d+', 'http://hg.scrapy.org',
+    'http://directory.google.com/'
+]
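
The new ``linkcheck_ignore`` entries are regular expressions that the Sphinx linkcheck builder matches against each URI before checking it, so for example localhost URLs with any port number are skipped. A minimal sketch of that matching behaviour (the sample URLs below are made up for illustration):

    import re

    # Same patterns as added to docs/conf.py above.
    linkcheck_ignore = [
        r'http://localhost:\d+', r'http://hg.scrapy.org',
        r'http://directory.google.com/'
    ]

    for url in ('http://localhost:8000/en/latest/', 'http://example.com/page'):
        skipped = any(re.match(pattern, url) for pattern in linkcheck_ignore)
        print(url, '-> skipped by linkcheck' if skipped else '-> would be checked')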
@@ -20,7 +20,7 @@ In other words, comparing `BeautifulSoup`_ (or `lxml`_) to Scrapy is like
 comparing `jinja2`_ to `Django`_.
 
 .. _BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
-.. _lxml: http://codespeak.net/lxml/
+.. _lxml: http://lxml.de/
 .. _jinja2: http://jinja.pocoo.org/2/
 .. _Django: http://www.djangoproject.com
 
@@ -59,7 +59,7 @@ The next thing is to write a Spider which defines the start URL
 for extracting the data from pages.
 
 If we take a look at that page content we'll see that all torrent URLs are like
-http://www.mininova.org/tor/NUMBER where ``NUMBER`` is an integer. We'll use
+``http://www.mininova.org/tor/NUMBER`` where ``NUMBER`` is an integer. We'll use
 that to construct the regular expression for the links to follow: ``/tor/\d+``.
 
 We'll use `XPath`_ for selecting the data to extract from the web page HTML
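
The ``/tor/\d+`` pattern mentioned in this hunk is what a crawl spider would hand to its link extractor. A rough, hypothetical sketch using current Scrapy class names (the spider name, start URL and extracted field are illustrative, not taken from this diff):

    from scrapy.linkextractors import LinkExtractor
    from scrapy.spiders import CrawlSpider, Rule

    class MininovaSpider(CrawlSpider):
        name = 'mininova'
        allowed_domains = ['mininova.org']
        start_urls = ['http://www.mininova.org/today']

        # Follow only links whose URL matches /tor/<NUMBER>.
        rules = [Rule(LinkExtractor(allow=r'/tor/\d+'), callback='parse_torrent')]

        def parse_torrent(self, response):
            yield {
                'url': response.url,
                'name': response.xpath('//h1/text()').extract_first(),
            }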
@@ -47,7 +47,7 @@ Enhancements
 
 - [**Backwards incompatible**] Switched HTTPCacheMiddleware backend to filesystem (:issue:`541`)
   To restore old backend set `HTTPCACHE_STORAGE` to `scrapy.contrib.httpcache.DbmCacheStorage`
-- Proxy https:// urls using CONNECT method (:issue:`392`, :issue:`397`)
+- Proxy \https:// urls using CONNECT method (:issue:`392`, :issue:`397`)
 - Add a middleware to crawl ajax crawleable pages as defined by google (:issue:`343`)
 - Rename scrapy.spider.BaseSpider to scrapy.spider.Spider (:issue:`510`, :issue:`519`)
 - Selectors register EXSLT namespaces by default (:issue:`472`)
@@ -394,7 +394,7 @@ Scrapy changes:
 - nested items now fully supported in JSON and JSONLines exporters
 - added :reqmeta:`cookiejar` Request meta key to support multiple cookie sessions per spider
 - decoupled encoding detection code to `w3lib.encoding`_, and ported Scrapy code to use that module
-- dropped support for Python 2.5. See http://blog.scrapy.org/scrapy-dropping-support-for-python-25
+- dropped support for Python 2.5. See http://blog.scrapinghub.com/2012/02/27/scrapy-0-15-dropping-support-for-python-2-5/
 - dropped support for Twisted 2.5
 - added :setting:`REFERER_ENABLED` setting, to control referer middleware
 - changed default user agent to: ``Scrapy/VERSION (+http://scrapy.org)``
@@ -744,7 +744,7 @@ First release of Scrapy.
 
 .. _AJAX crawleable urls: http://code.google.com/web/ajaxcrawling/docs/getting-started.html
 .. _chunked transfer encoding: http://en.wikipedia.org/wiki/Chunked_transfer_encoding
-.. _w3lib: http://https://github.com/scrapy/w3lib
+.. _w3lib: https://github.com/scrapy/w3lib
 .. _scrapely: https://github.com/scrapy/scrapely
 .. _marshal: http://docs.python.org/library/marshal.html
 .. _w3lib.encoding: https://github.com/scrapy/w3lib/blob/master/w3lib/encoding.py
@@ -121,10 +121,10 @@ for concurrency.
 For more information about asynchronous programming and Twisted see these
 links:
 
-* `Asynchronous Programming with Twisted`_
+* `Introduction to Deferreds in Twisted`_
 * `Twisted - hello, asynchronous programming`_
 
 .. _Twisted: http://twistedmatrix.com/trac/
-.. _Asynchronous Programming with Twisted: http://twistedmatrix.com/projects/core/documentation/howto/async.html
+.. _Introduction to Deferreds in Twisted: http://twistedmatrix.com/documents/current/core/howto/defer-intro.html
 .. _Twisted - hello, asynchronous programming: http://jessenoller.com/2009/02/11/twisted-hello-asynchronous-programming/
 
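
For readers following those links, the central idea is Twisted's ``Deferred``: a placeholder for a result that arrives later, to which callbacks are attached. A tiny self-contained sketch (not taken from the Scrapy docs themselves):

    from twisted.internet import defer, reactor

    def on_result(result):
        print('got:', result)
        reactor.stop()

    d = defer.Deferred()
    d.addCallback(on_result)

    # Simulate an asynchronous operation finishing one second later.
    reactor.callLater(1, d.callback, 'hello, asynchronous programming')
    reactor.run()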
@@ -499,4 +499,4 @@ Example::
 
     COMMANDS_MODULE = 'mybot.commands'
 
-.. _Deploying your project: http://scrapyd.readthedocs.org/en/latest/#deploying-your-project
+.. _Deploying your project: http://scrapyd.readthedocs.org/en/latest/deploy.html
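
``COMMANDS_MODULE`` points Scrapy at a package containing custom command classes. As a hedged sketch of what might live in such a package (the module path and command are hypothetical; current Scrapy exposes the base class as ``scrapy.commands.ScrapyCommand``, and the command is named after the module it lives in):

    # mybot/commands/hello.py -- hypothetical module found via COMMANDS_MODULE
    from scrapy.commands import ScrapyCommand

    class Command(ScrapyCommand):
        requires_project = True

        def short_desc(self):
            return 'Print a greeting from a custom command'

        def run(self, args, opts):
            print('hello from a custom scrapy command')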
@@ -15,7 +15,7 @@ simple API for sending attachments and it's very easy to configure, with a few
 :ref:`settings <topics-email-settings>`.
 
 .. _smtplib: http://docs.python.org/library/smtplib.html
-.. _Twisted non-blocking IO: http://twistedmatrix.com/projects/core/documentation/howto/async.html
+.. _Twisted non-blocking IO: http://twistedmatrix.com/documents/current/core/howto/defer-intro.html
 
 Quick example
 =============
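
The section this hunk touches introduces ``scrapy.mail.MailSender``; the quick example itself is cut off by the diff, but usage looks roughly like the sketch below (the SMTP host and addresses are placeholders, and sending actually happens through the Twisted reactor running inside a crawl):

    from scrapy.mail import MailSender

    # Placeholder SMTP host and addresses, for illustration only.
    mailer = MailSender(smtphost='localhost', mailfrom='scrapy@localhost')
    mailer.send(
        to=['someone@example.com'],
        subject='Crawl finished',
        body='The spider has finished running.',
    )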
@@ -37,7 +37,7 @@ For a complete reference of the selectors API see
 :ref:`Selector reference <topics-selectors-ref>`
 
 .. _BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
-.. _lxml: http://codespeak.net/lxml/
+.. _lxml: http://lxml.de/
 .. _ElementTree: http://docs.python.org/library/xml.etree.elementtree.html
 .. _cssselect: https://pypi.python.org/pypi/cssselect/
 .. _XPath: http://www.w3.org/TR/xpath
@@ -247,12 +247,12 @@ Being built atop `lxml`_, Scrapy selectors also support some `EXSLT`_ extensions
 and come with these pre-registered namespaces to use in XPath expressions:
 
 
-====== ===================================== =======================
-prefix namespace                             usage
-====== ===================================== =======================
-re     http://exslt.org/regular-expressions `regular expressions`_
-set    http://exslt.org/sets                `set manipulation`_
-====== ===================================== =======================
+====== ====================================== =======================
+prefix namespace                              usage
+====== ====================================== =======================
+re     \http://exslt.org/regular-expressions `regular expressions`_
+set    \http://exslt.org/sets                `set manipulation`_
+====== ====================================== =======================
 
 Regular expressions
 ~~~~~~~~~~~~~~~~~~~
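
Because the ``re`` namespace in the table above is pre-registered, EXSLT functions such as ``re:test()`` can be used directly in selector XPath expressions. A small illustrative sketch (the HTML snippet is made up):

    from scrapy.selector import Selector

    html = '<ul><li><a href="/tor/123">ok</a></li><li><a href="/about">no</a></li></ul>'
    sel = Selector(text=html)

    # re:test() comes from the pre-registered EXSLT regular-expressions namespace.
    links = sel.xpath(r'//a[re:test(@href, "/tor/\d+")]/@href').extract()
    print(links)  # ['/tor/123']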
@@ -594,4 +594,4 @@ of relevance, are:
 case some element names clash between namespaces. These cases are very rare
 though.
 
-.. _Google Base XML feed: http://base.google.com/support/bin/answer.py?hl=en&answer=59461
+.. _Google Base XML feed: https://support.google.com/merchants/answer/160589?hl=en&ref_topic=2473799
tox.ini (10 changed lines)
@@ -4,7 +4,7 @@
 # and then run "tox" from this directory.
 
 [tox]
-envlist = py27, pypy, precise, trunk, py33
+envlist = py27, pypy, precise, trunk, py33, docs
 indexserver =
     HPK = https://devpi.net/hpk/dev/
 
@@ -56,3 +56,11 @@ deps =
 commands =
     bin/runtests.bat []
 sitepackages = False
+
+[testenv:docs]
+changedir = docs
+deps =
+    Sphinx
+commands =
+    sphinx-build -W -b html . build/html
+    sphinx-build -W -b linkcheck . build/linkcheck
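
With this environment added, the docs build and link check can presumably be run locally with ``tox -e docs`` from the repository root; because ``sphinx-build`` is invoked with ``-W``, Sphinx warnings are treated as errors, so documentation problems fail the tox run rather than passing silently.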