mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 17:24:14 +00:00

763 Commits

Author SHA1 Message Date
Mikhail Korobov
9a999daa2a DOWNLOAD_DELAY docs clarification:
* delay is enforced per website, not per spider;
* document download_delay attribute (it was previously documented only in FAQ about 999 error codes);
* document how CONCURRENT_REQUESTS_PER_IP affects download delays.
2013-12-28 06:30:34 +06:00
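
The first point in this commit message, that the delay applies per website rather than per spider, can be illustrated with a small standalone sketch. It does not use Scrapy or reproduce its downloader. `PerSiteThrottle` and `wait_time` are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to check.

```python
from urllib.parse import urlparse


class PerSiteThrottle:
    """Toy model: one delay timer per hostname (hypothetical, not Scrapy code)."""

    def __init__(self, delay):
        self.delay = delay          # seconds between requests to the same site
        self.last_request = {}      # hostname -> time the last request went out

    def wait_time(self, url, now):
        """Return how long to wait before requesting `url` at time `now`."""
        host = urlparse(url).hostname
        last = self.last_request.get(host)
        wait = 0.0 if last is None else max(0.0, last + self.delay - now)
        # Record when this request will actually be sent.
        self.last_request[host] = now + wait
        return wait


throttle = PerSiteThrottle(delay=2.0)
first = throttle.wait_time("http://example.com/a", now=0.0)
other_site = throttle.wait_time("http://example.org/a", now=0.5)
same_site = throttle.wait_time("http://example.com/b", now=0.5)
```

In this model the second request to example.com has to wait, while a request to example.org issued at the same moment does not.
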
Pablo Hoffman
e42e3743fe quick documentation for #475 2013-12-24 12:19:15 -02:00
RasPat1
ff21281b95 Note about selector class import
This is the salient point of this code compared to the last example. We have a selector now and this is how we use it. This matters especially because the user has just come from the shell, where the pre-instantiated selector is taken for granted.
2013-12-15 13:46:42 -05:00
Daniel Graña
8a7c5b5d81 Add 0.20.2 release notes
Conflicts:
	docs/news.rst
2013-12-09 18:33:46 -02:00
Pablo Hoffman
f2741c413e fix method name in tutorial. closes GH-480 2013-12-02 13:24:12 -02:00
Daniel Graña
e34ffc0f42 Add 0.20.1 release notes
Conflicts:
	docs/news.rst
2013-11-28 16:25:57 -02:00
Pablo Hoffman
339861367e Merge pull request #425 from audiodude/master
DownloaderMiddleware docs: Update process_request and minor cleanups.
2013-11-25 10:33:35 -08:00
Paul Tremberth
14f5817d6b Modify ItemLoader to support XPath and CSS selectors
Deprecate XPathItemLoader (now an alias to the new ItemLoader)
2013-11-21 18:05:24 +01:00
Pablo Hoffman
f87be371a2 better names for HANDLE_* settings, and added doc 2013-11-21 14:33:17 -02:00
Brian Lange
e4c1d8d37d Elaborate on use of order numbers 2013-11-19 17:51:50 -06:00
Brian Lange
b878f60b5a Add note to item-pipeline documentation explaining order in the ITEM_PIPELINES setting. 2013-11-19 16:12:54 -06:00
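
As a rough illustration of the ordering this note describes, the sketch below sorts a dict-style `ITEM_PIPELINES` setting by its order numbers, assuming lower numbers run first. The pipeline paths are hypothetical, and the sorting is only a model of the idea, not Scrapy's own loading code.

```python
# Hypothetical pipeline paths mapped to their order numbers.
ITEM_PIPELINES = {
    "myproject.pipelines.StoreInDatabase": 800,
    "myproject.pipelines.ValidatePrice": 300,
    "myproject.pipelines.DropDuplicates": 500,
}

# Assumption: items pass through pipelines from the lowest number to the highest.
run_order = sorted(ITEM_PIPELINES, key=ITEM_PIPELINES.get)

for path in run_order:
    print(ITEM_PIPELINES[path], path)
```

Leaving gaps between the numbers, as here, makes it possible to slot a new pipeline between two existing ones without renumbering.
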
Pablo Hoffman
afe6eaa2fe Merge pull request #460 from tntC4stl3/master
duplicate 'use' in line 87
2013-11-15 04:10:49 -08:00
tntC4stl3
b51d5d81e4 duplicate 'use' in line 87 2013-11-15 13:56:44 +08:00
Daniel Graña
04ff7ecebf improve 0.20 release notes
Conflicts:
	docs/news.rst
2013-11-08 17:45:03 -02:00
Daniel Graña
3d18a3c49e bumped version to 0.21.0 2013-11-08 17:09:00 -02:00
Daniel Graña
d0980e5c9b Merge 0.20 release notes 2013-11-08 17:06:10 -02:00
Daniel Graña
2df8156431 Drop Python 2.6 support 2013-10-29 13:44:00 -02:00
Pablo Hoffman
911c8082b0 simplified description of crawl command 2013-10-21 14:42:51 -02:00
Pablo Hoffman
e8ee449a2a Merge pull request #432 from darkrho/crawl-url
Removed URL reference in crawl command and .tld suffix in docs for spider names
2013-10-21 09:40:58 -07:00
Rolando Espinoza La fuente
34543c2b2e DOCS removed .tld suffix for spider names for the sake of consistency. 2013-10-19 23:03:20 -04:00
Daniel Graña
875b07aef8 fix references to old selector naming in docs 2013-10-17 09:33:15 -02:00
Travis Briggs
3043a5ba37 DownloaderMiddleware docs: Update process_request, proper explanation of IgnoreRequest.
Also:
* Change terminology to eliminate uses of terms such as "request middleware" to refer to the process_request methods of installed middleware.
* Remove description of "immediate redirection", as it is misleading.

Further changes.
2013-10-17 00:23:21 +00:00
Mikhail Korobov
086b8a20d4 typo fix in TextResponse docs 2013-10-17 04:50:30 +06:00
Pablo Hoffman
951a9f3f4c Merge pull request #226 from scraperdragon/patch-1
Parameters to Request() in wrong order
2013-10-16 13:15:52 -07:00
Daniel Graña
1461363809 Replace contenttype references by type
The type to choose from is the selector type, not the input type. A
content-type doesn't make sense in this context.
2013-10-16 17:37:25 -02:00
Daniel Graña
155ea08ea1 use sel name for Selector's instances in docs, internals and shell 2013-10-15 15:58:42 -02:00
Daniel Graña
1abb1af0c6 fix typos and wording on selector's introduction 2013-10-15 10:13:43 -02:00
Dragon Dave
a3b711bdea Move callback blob; mention errback 2013-10-15 12:19:42 +01:00
scraperdragon
0ba0d85685 Parameters to Request() in wrong order
Implied that callback wasn't the first optional unnamed parameter.
2013-10-15 11:50:43 +01:00
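
The point of this fix, that `callback` is the first optional positional parameter after `url`, can be shown with a stand-in. `make_request` below is a hypothetical function that mirrors only the first few parameters (`url`, `callback`, `method`). It is not Scrapy's `Request` class.

```python
def make_request(url, callback=None, method="GET"):
    """Hypothetical stand-in that records how its arguments were bound."""
    return {"url": url, "callback": callback, "method": method}


def parse_page(response):
    """Placeholder callback; a real spider would extract data here."""
    return response


# The second positional argument binds to `callback`, not to `method`.
positional = make_request("http://example.com", parse_page)

# Passing it by keyword is equivalent and harder to get wrong.
by_keyword = make_request("http://example.com", callback=parse_page)
```
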
Daniel Graña
28999590fa update release notes 2013-10-14 16:41:04 -02:00
Daniel Graña
ab9462a251 remove more references to libxml2 2013-10-14 16:37:14 -02:00
Daniel Graña
4645f9e03c Updates docs to reflect unified selectors api 2013-10-14 16:31:20 -02:00
Daniel Graña
bf37f78572 Drop libxml2 selectors backend 2013-10-11 18:02:35 -02:00
Daniel Graña
6d598f0d94 Update selectors docs 2013-10-10 18:24:00 -02:00
Capi Etheriel
bc17e9d412 Adds HtmlCSSSelector and XmlCSSSelector classes, cssselect as optional dependency.
Ported .get() from _Element and .text_content() from HTMLMixin

Add CSS selectors to scrapy shell

Documenting CSS Selectors: Constructing selectors

Documenting CSS Selectors: Using Selectors

Make CSS Selectors a default feature.

Adds XPath powers to CSS Selectors and some syntactic sugar.

Removes methods copied over from lxml.html.HtmlMixin.

Updating docs to use new CSS Selector super powers.

Documenting CSS Selectors: Regular Expressions

Moving section after Nesting section, since it mentions it.

Documenting CSS Selectors: Nesting Selectors

Fix XPath specificity in lxml.selector.CSSSelectorMixin.text

Cleaning up unused stuff from cssel.py

Changing the behavior of lxml.selector.CSSSelectorMixin.text.

Concatenating all of the descendant text nodes is more useful
than returning it in pieces (there's xpath() if you need that).

Documenting CSS Selectors: CSS Selector objects

Documenting CSS Selectors: CSSSelectorList objects

Documenting CSS Selectors: HtmlCSSSelector objects

Documenting CSS Selectors: XmlCSSSelector objects

Fixing some documentation typos and errors

Enforcing the 80-char width lines

Tidying up CSS selectors and CSSSelectorMixin objects

Adding some missing references in documentation.

Fixing lxml.selector.CSSSelectorList.text
2013-10-10 18:23:15 -02:00
Daniel Graña
5b5dd679b0 Add 0.18.4 release notes
Conflicts:
	docs/news.rst
2013-10-10 01:04:35 -02:00
Pablo Hoffman
e1683ddf9b fix doc typo 2013-10-09 17:24:12 -02:00
Pablo Hoffman
8b9526a8f6 Merge pull request #400 from irgmedeiros/patch-2
Update the second code example
2013-10-07 07:57:18 -07:00
Pablo Hoffman
86c6e9433f remove minor reference to 'scrapy server' command 2013-10-04 14:37:55 -03:00
Daniel Graña
aad90ec5a2 Add 0.18.3 release notes
Conflicts:
	docs/news.rst
2013-10-03 12:56:25 -03:00
Pablo Hoffman
37c24e01d7 document bindaddress request meta 2013-10-02 17:13:17 -03:00
Pablo Hoffman
a9c3519897 updated required twisted version to 10.0 2013-10-01 14:07:38 -03:00
Rolando Espinoza
d6e3eae527 docs: added section regarding setting up django's settings. 2013-09-30 09:58:10 -04:00
Rolando Espinoza
0cc1d870db docs: minor tidy up sample code and missing shell prompts. 2013-09-30 09:58:10 -04:00
Loren Davie
8af0e89e85 Corrected typo. 2013-09-29 17:06:46 -04:00
Loren Davie
f49f5724d5 Added dynamic creation of item classes to practices.rst. 2013-09-28 09:00:48 -04:00
irgmedeiros
9b50409986 Update the second code example
Update the second code example to reflect the last change in the first example.
2013-09-27 18:22:33 -03:00
irgmedeiros
d9e0fdc9aa Update practices.rst
With this modification, Scrapy runs the spider with the project settings. The previous example ran with default settings only, so it ignored all user settings, such as pipelines.
2013-09-27 17:56:30 -03:00
Daniel Graña
265910aae6 Merge pull request #363 from taikano/sitemap_alternate
also fetch alternate URLs from sitemaps, see #360
2013-09-26 09:15:02 -07:00
Pablo Hoffman
12280c2a95 fix sphinx references in doc 2013-09-25 15:13:17 -03:00