mirror of
https://github.com/scrapy/scrapy.git
synced 2025-02-24 16:44:19 +00:00
mention crawlera in best practices, as a way to deal with bans
parent
e1f4144391
commit
66311db23e
@@ -120,6 +120,9 @@ Here are some tips to keep in mind when dealing with these kind of sites:
   directly
 * use a pool of rotating IPs. For example, the free `Tor project`_ or paid
   services like `ProxyMesh`_
+* use a highly distributed downloader that circumvents bans internally, so you
+  can just focus on parsing clean pages. One example of such downloaders is
+  `Crawlera`_
 
 If you are still unable to prevent your bot getting banned, consider contacting
 `commercial support`_.
@@ -130,3 +133,4 @@ If you are still unable to prevent your bot getting banned, consider contacting
 .. _Google cache: http://www.googleguide.com/cached_pages.html
 .. _testspiders: https://github.com/scrapinghub/testspiders
 .. _Twisted Reactor Overview: http://twistedmatrix.com/documents/current/core/howto/reactor-basics.html
+.. _Crawlera: http://crawlera.com
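
The "pool of rotating IPs" tip in the diff above can be illustrated with a small downloader middleware. This is only a sketch and not part of the commit: the class name `RotatingProxyMiddleware` and the `PROXY_POOL` setting are hypothetical, and the proxy URLs are placeholders. It relies on Scrapy honouring a proxy URL stored in `request.meta["proxy"]`.

```python
import itertools


class RotatingProxyMiddleware:
    """Hypothetical downloader middleware that assigns proxies round-robin."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy URL is required")
        # cycle() repeats the pool endlessly, one proxy per request
        self._pool = itertools.cycle(proxies)

    @classmethod
    def from_crawler(cls, crawler):
        # PROXY_POOL is an assumed, project-specific setting name,
        # e.g. PROXY_POOL = ["http://proxy1.example:8080", ...]
        return cls(crawler.settings.getlist("PROXY_POOL"))

    def process_request(self, request, spider):
        # Leave requests alone if a proxy was already chosen for them
        if "proxy" not in request.meta:
            request.meta["proxy"] = next(self._pool)
        # Returning None lets Scrapy continue processing the request
        return None
```

To try it in a project, the class would be registered under the `DOWNLOADER_MIDDLEWARES` setting with whatever module path the project uses. A hosted service such as the one named in the commit handles this rotation on the service side instead.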