{% extends "base_home.html" %}
{% block main-content %}
<h2>Welcome to Scrapy</h2>
<blockquote><p>
Scrapy is a high-level scraping and web crawling framework for writing
spiders to crawl and parse web pages for all kinds of purposes, from
information retrieval to monitoring or testing web sites.
</p></blockquote>
<h3>Features</h3>
<dl>
<dt>Productive</dt>
<dd>Just write the rules to extract data from pages and let Scrapy crawl the entire web site for you</dd>
<dt>Scalable</dt>
<dd>Scrapy is being used in production to scrape more than 500 sites daily, all on one server</dd>
<dt>Distributed</dt>
<dd>If you need more processing power or bandwidth, Scrapy comes bundled with a master/slave cluster that lets you scrape using as many servers as you need</dd>
<dt>Extensible</dt>
<dd>Scrapy was designed with extensibility in mind, so it provides several mechanisms to plug in new code without having to touch the framework core</dd>
<dt>Portable</dt>
<dd>Scrapy runs on Linux, Windows and Mac</dd>
<dt>100% Python</dt>
<dd>Scrapy is written entirely in Python, which makes it very easy to hack on</dd>
</dl>
<h3>Project status</h3>
<p>
We're currently preparing the first official release of Scrapy. At the
moment we consider the API quite stable, and we're writing
documentation, tutorials and examples to make it easier to start
using it.
</p>
<h3>Where to start?</h3>
<p>
Please start by reading <a href="/docs/">the documentation</a> and checking
out the community resources where you can ask for further help while we
finish improving the documentation.
</p>
{% endblock %}