mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 23:04:14 +00:00

Merge pull request #1188 from eliasdorneles/favoring_web_scraping_over_screen_scraping

[MRG+1] Favoring web scraping over screen scraping in the descriptions
Pablo Hoffman 2015-04-29 16:49:43 -03:00
commit 3d2b74a6ff
6 changed files with 9 additions and 11 deletions


@@ -17,7 +17,7 @@ Scrapy
Overview
========
-Scrapy is a fast high-level web crawling and screen scraping framework, used to
+Scrapy is a fast high-level web crawling and web scraping framework, used to
crawl websites and extract structured data from their pages. It can be used for
a wide range of purposes, from data mining to monitoring and automated testing.

debian/control

@@ -13,8 +13,8 @@ Depends: ${python:Depends}, python-lxml, python-twisted, python-openssl,
Recommends: python-setuptools
Conflicts: python-scrapy, scrapy, scrapy-0.11
Provides: python-scrapy, scrapy
-Description: Python web crawling and screen scraping framework
-Scrapy is a fast high-level web crawling and screen scraping framework,
+Description: Python web crawling and web scraping framework
+Scrapy is a fast high-level web crawling and web scraping framework,
used to crawl websites and extract structured data from their pages.
It can be used for a wide range of purposes, from data mining to
monitoring and automated testing.


@@ -8,10 +8,9 @@ Scrapy is an application framework for crawling web sites and extracting
structured data which can be used for a wide range of useful applications, like
data mining, information processing or historical archival.
-Even though Scrapy was originally designed for `screen scraping`_ (more
-precisely, `web scraping`_), it can also be used to extract data using APIs
-(such as `Amazon Associates Web Services`_) or as a general purpose web
-crawler.
+Even though Scrapy was originally designed for `web scraping`_, it can also be
+used to extract data using APIs (such as `Amazon Associates Web Services`_) or
+as a general purpose web crawler.
Walk-through of an example spider
@@ -171,7 +170,6 @@ your code in Scrapy projects and `join the community`_. Thanks for your
interest!
.. _join the community: http://scrapy.org/community/
-.. _screen scraping: http://en.wikipedia.org/wiki/Screen_scraping
.. _web scraping: http://en.wikipedia.org/wiki/Web_scraping
.. _Amazon Associates Web Services: http://aws.amazon.com/associates/
.. _Amazon S3: http://aws.amazon.com/s3/


@@ -8,7 +8,7 @@ When you're scraping web pages, the most common task you need to perform is
to extract data from the HTML source. There are several libraries available to
achieve this:
-* `BeautifulSoup`_ is a very popular screen scraping library among Python
+* `BeautifulSoup`_ is a very popular web scraping library among Python
programmers which constructs a Python object based on the structure of the
HTML code and also deals with bad markup reasonably well, but it has one
drawback: it's slow.


@@ -1,5 +1,5 @@
"""
-Scrapy - a web crawling and screen scraping framework written for Python
+Scrapy - a web crawling and web scraping framework written for Python
"""
__all__ = ['__version__', 'version_info', 'optional_features', 'twisted_version',


@@ -10,7 +10,7 @@ setup(
name='Scrapy',
version=version,
url='http://scrapy.org',
-description='A high-level Web Crawling and Screen Scraping framework',
+description='A high-level Web Crawling and Web Scraping framework',
long_description=open('README.rst').read(),
author='Scrapy developers',
maintainer='Pablo Hoffman',