scrapy

mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 22:24:33 +00:00

Author	SHA1	Message	Date
Daniel Graña	b9bb9bed6b	Merge pull request #343 from kmike/ajax-crawlable [MRG] AjaxCrawlableMiddleware	2014-01-16 05:07:39 -08:00
Paul Tremberth	827c0cf51f	Rename "regexp" prefix to "re"	2014-01-15 15:00:25 +01:00
Paul Tremberth	88c8a523a7	Add warning in docs on performance when using EXSLT regexp functions	2014-01-15 12:52:10 +01:00
Paul Tremberth	a3eba68aca	Drop EXSLT strings and math extensions	2014-01-15 12:28:25 +01:00
Pablo Hoffman	ea2f897b81	Merge pull request #502 from scrapy/doc-fixes DOWNLOAD_DELAY docs clarification	2014-01-14 21:07:42 -08:00
Paul Tremberth	2cc26e6f56	Fix typo error	2014-01-14 13:09:18 +01:00
Paul Tremberth	29fc9f3466	Update selectors documentation and tests	2014-01-14 12:56:37 +01:00
Hobson Lane	6ba0857a5c	documentation code example correction corrections per pablohoffman	2014-01-10 10:37:27 -08:00
malcolm m	962e5ef702	Clarify return value from extract_links	2014-01-05 14:42:48 -08:00
Yuri Prezument	060891c01c	Remove unused import from code sample Item pipeline docs - removed unused import from code sample	2014-01-03 15:44:17 +02:00
Mikhail Korobov	a27d91f0a6	Rename BaseSpider to Spider. See GH-495.	2013-12-30 19:46:41 +06:00
Mikhail Korobov	e713733edf	minor fixes to scrapy shell docs * better IPython links; * MDC link instead of w3schools; * small formatting fixes; * show quoted URL in example	2013-12-30 10:27:39 +06:00
Mikhail Korobov	9a999daa2a	DOWNLOAD_DELAY docs clarification: * delay is enforced per website, not per spider; * document download_delay attribute (it was previously documented only in FAQ about 999 error codes); * document how CONCURRENT_REQUESTS_PER_IP affects download delays.	2013-12-28 06:30:34 +06:00
Pablo Hoffman	e42e3743fe	quick documentation for #475	2013-12-24 12:19:15 -02:00
Mikhail Korobov	e0cebbfc8f	add a remark about 1%	2013-12-20 23:12:37 +06:00
Mikhail Korobov	943a0bd264	AjaxCrawlableMiddleware in Broad Crawl docs	2013-12-19 01:01:26 +06:00
Mikhail Korobov	a87b3bd1c8	AjaxCrawlableMiddleware	2013-12-19 00:06:47 +06:00
Pablo Hoffman	339861367e	Merge pull request #425 from audiodude/master DownloaderMiddleware docs: Update process_request and minor cleanups.	2013-11-25 10:33:35 -08:00
Paul Tremberth	14f5817d6b	Modify ItemLoader to support XPath and CSS selectors Deprecate XPathItemLoader (now an alias to the new ItemLoader)	2013-11-21 18:05:24 +01:00
Pablo Hoffman	f87be371a2	better names for HANDLE_* settings, and added doc	2013-11-21 14:33:17 -02:00
Brian Lange	e4c1d8d37d	Elaborate on use of order numbers	2013-11-19 17:51:50 -06:00
Brian Lange	b878f60b5a	Add note to item-pipeline documentation explaining order in the ITEM_PIPELINES setting.	2013-11-19 16:12:54 -06:00
tntC4stl3	b51d5d81e4	duplicate 'use' in line 87	2013-11-15 13:56:44 +08:00
Daniel Graña	2df8156431	Drop Python 2.6 support	2013-10-29 13:44:00 -02:00
Pablo Hoffman	911c8082b0	simplified description of crawl command	2013-10-21 14:42:51 -02:00
Pablo Hoffman	e8ee449a2a	Merge pull request #432 from darkrho/crawl-url Removed URL reference in crawl command and .tld suffix in docs for spider names	2013-10-21 09:40:58 -07:00
Rolando Espinoza La fuente	34543c2b2e	DOCS removed .tld suffix for spider names for the sake of consistency.	2013-10-19 23:03:20 -04:00
Daniel Graña	875b07aef8	fix references to old selector naming in docs	2013-10-17 09:33:15 -02:00
Travis Briggs	3043a5ba37	DownloaderMiddleware docs: Update process_request, proper explanation of IgnoreRequest. Also: * Change terminology to eliminate uses of terms such as "request middleware" to refer to the process_request methods of installed middleware. * Remove description of "immediate redirection", as it is misleading. Further changes.	2013-10-17 00:23:21 +00:00
Mikhail Korobov	086b8a20d4	typo fix in TextResponse docs	2013-10-17 04:50:30 +06:00
Pablo Hoffman	951a9f3f4c	Merge pull request #226 from scraperdragon/patch-1 Parameters to Request() in wrong order	2013-10-16 13:15:52 -07:00
Daniel Graña	1461363809	Replace `contenttype` references by `type` The type to choose from is the selector type, not the input type. A content-type doesn't make sense in this context.	2013-10-16 17:37:25 -02:00
Daniel Graña	155ea08ea1	use `sel` name for Selector's instances in docs, internals and shell	2013-10-15 15:58:42 -02:00
Dragon Dave	a3b711bdea	Move callback blob; mention errback	2013-10-15 12:19:42 +01:00
scraperdragon	0ba0d85685	Parameters to Request() in wrong order Implied that callback wasn't the first optional unnamed parameter.	2013-10-15 11:50:43 +01:00
Daniel Graña	ab9462a251	remove more references to libxml2	2013-10-14 16:37:14 -02:00
Daniel Graña	4645f9e03c	Updates docs to reflect unified selectors api	2013-10-14 16:31:20 -02:00
Daniel Graña	bf37f78572	Drop libxml2 selectors backend	2013-10-11 18:02:35 -02:00
Daniel Graña	6d598f0d94	Update selectors docs	2013-10-10 18:24:00 -02:00
Capi Etheriel	bc17e9d412	Adds HtmlCSSSelector and XmlCSSSelector classes, cssselect as optional dependency. * Ported .get() from _Element and .text_content() from HTMLMixin; * Add CSS selectors to scrapy shell; * Documenting CSS Selectors: Constructing selectors; * Documenting CSS Selectors: Using Selectors; * Make CSS Selectors a default feature; * Adds XPath powers to CSS Selectors and some syntactic sugar; * Removes methods copied over from lxml.html.HtmlMixin; * Updating docs to use new CSS Selector super powers; * Documenting CSS Selectors: Regular Expressions; * Moving section after Nesting section, since it mentions it; * Documenting CSS Selectors: Nesting Selectors; * Fix XPath specificity in lxml.selector.CSSSelectorMixin.text; * Cleaning up unused stuff from cssel.py; * Changing the behavior of lxml.selector.CSSSelectorMixin.text. Concatenating all of the descendant text nodes is more useful than returning it in pieces (there's xpath() if you need that); * Documenting CSS Selectors: CSS Selector objects; * Documenting CSS Selectors: CSSSelectorList objects; * Documenting CSS Selectors: HtmlCSSSelector objects; * Documenting CSS Selectors: XmlCSSSelector objects; * Fixing some documentations typos and errors; * Enforcing the 80-char width lines; * Tidying up CSS selectors and CSSSelectorMixin objects; * Adding some missing references in documentation; * Fixing lxml.selector.CSSSelectorList.text	2013-10-10 18:23:15 -02:00
Pablo Hoffman	8b9526a8f6	Merge pull request #400 from irgmedeiros/patch-2 Update the second code example	2013-10-07 07:57:18 -07:00
Pablo Hoffman	86c6e9433f	remove minor reference to 'scrapy server' command	2013-10-04 14:37:55 -03:00
Pablo Hoffman	37c24e01d7	document bindaddress request meta	2013-10-02 17:13:17 -03:00
Pablo Hoffman	a9c3519897	updated required twisted version to 10.0	2013-10-01 14:07:38 -03:00
Rolando Espinoza	d6e3eae527	docs: added section regarding setting up django's settings.	2013-09-30 09:58:10 -04:00
Rolando Espinoza	0cc1d870db	docs: minor tidy up sample code and missing shell prompts.	2013-09-30 09:58:10 -04:00
Loren Davie	8af0e89e85	Corrected typo.	2013-09-29 17:06:46 -04:00
Loren Davie	f49f5724d5	Added dynamic creation of item classes to practices.rst.	2013-09-28 09:00:48 -04:00
irgmedeiros	9b50409986	Update the second code example Update the second code example to reflect the last change in the first example.	2013-09-27 18:22:33 -03:00
irgmedeiros	d9e0fdc9aa	Update practices.rst With this modification scrapy runs the spider with project settings. The previous example ran only with default settings resulting in ignoring all user settings as pipelines for example.	2013-09-27 17:56:30 -03:00

552 Commits