mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 00:24:14 +00:00

657 Commits

Author SHA1 Message Date
Rolando Espinoza
0cc1d870db docs: minor tidy up sample code and missing shell prompts. 2013-09-30 09:58:10 -04:00
Loren Davie
8af0e89e85 Corrected typo. 2013-09-29 17:06:46 -04:00
Loren Davie
f49f5724d5 Added dynamic creation of item classes to practices.rst. 2013-09-28 09:00:48 -04:00
irgmedeiros
9b50409986 Update the second code example
Update the second code example to reflect the last change in the first example.
2013-09-27 18:22:33 -03:00
irgmedeiros
d9e0fdc9aa Update practices.rst
With this modification, Scrapy runs the spider with the project settings. The previous example ran with default settings only, ignoring all user settings such as pipelines.
2013-09-27 17:56:30 -03:00
Daniel Graña
265910aae6 Merge pull request #363 from taikano/sitemap_alternate
also fetch alternate URLs from sitemaps, see #360
2013-09-26 09:15:02 -07:00
Pablo Hoffman
12280c2a95 fix sphinx references in doc 2013-09-25 15:13:17 -03:00
Pablo Hoffman
fc388f4636 Make ITEM_PIPELINES setting a dict
This is for consistency with how spider and downloader middlewares are
defined. ITEM_PIPELINES_BASE was also added and both remain empty.

Backwards compatibility is kept (with a warning) with list-based
ITEM_PIPELINES.
2013-09-23 17:50:43 -03:00
cacovsky
71b320914a Update request-response.rst
Fix small doc typo (too many backticks)
2013-09-18 11:45:25 -03:00
Stefan
6994959181 renamed to sitemap_alternate_links and added default value, see #360 2013-09-08 10:38:28 +02:00
Stefan
8ed2d0cda1 improved changes to allow retrieval of alternate links in sitemaps, see #360 2013-09-07 12:56:30 +02:00
Pablo Hoffman
86230c0ab8 added quantal & raring to support ubuntu releases 2013-08-22 21:49:55 -03:00
Mikhail Korobov
034ffae60f Recommend Pillow instead of PIL. Closes GH-317. 2013-08-18 00:44:01 +06:00
Berend Iwema
32b6364bcd #327 - Support STARTTLS / SSL option in email sender 2013-08-14 12:59:01 +02:00
Rocio Aramberri
d227d530f6 Added COMPRESSION_ENABLED setting to enable or disable the HttpCompressionMiddleware
Added COMPRESSION_ENABLED setting to docs

Added COMPRESSION_ENABLED setting to default settings
2013-08-01 11:31:28 -03:00
Dan
1ca31244b0 Fixed ordering of super argument call. 2013-07-16 14:50:10 -04:00
Dan
e12b689c4f Updated documentation of spider arguments to include required super call. 2013-07-16 14:26:53 -04:00
Mikhail Korobov
1a1c93fafe tiny FormRequest doc fix 2013-07-15 15:47:34 +06:00
Mikhail Korobov
ac2fadf3ab DownloaderMiddleware.process_response docs fix
"returns an exception" -> "raises an exception"
2013-07-08 19:41:58 +06:00
Mikhail Korobov
39e5da5f66 improve docs for DownloaderMiddleware.process_response 2013-07-08 19:17:29 +06:00
Pablo Hoffman
0f4b70f582 remove no deprecated request_scheduled signal
It will be replaced by more accurate scheduler signals (proposal will
come soon)
2013-06-27 11:23:24 -03:00
nramirezuy
bef8ade956 removed request_received and added request_scheduled 2013-06-26 16:45:46 -03:00
Pablo Hoffman
819b2776dd Merge pull request #326 from berendiwema/master
Include example of how to stop the reactor from script
2013-06-25 13:30:07 -07:00
nramirezuy
83b2774354 remove wrong default httpcache 2013-06-25 17:01:29 -03:00
Berend Iwema
aec314db09 added a bit more documentation on how to close the reactor when running scrapy from a script 2013-06-25 16:08:22 +02:00
Pablo Hoffman
bbde1d0e0b Merge pull request #275 from stav/doc
doc: Response.replace() cannot take meta argument
2013-06-24 11:09:28 -07:00
Capi Etheriel
50fa46d183 Document CrawlSpider.parse_start_urls method 2013-06-09 04:03:20 -03:00
Pablo Hoffman
8e49fed918 minor improvements to benchmarking doc 2013-05-16 13:23:13 -03:00
Pablo Hoffman
76087e336a add scrapy bench command for benchmarking, with documentation 2013-05-16 13:15:25 -03:00
Pablo Hoffman
66311db23e mention crawlera in best practices, as a way to deal with bans 2013-05-04 18:20:23 -03:00
Pablo Hoffman
9361c89573 remove scrapyd doc, as it was moved to its own repo 2013-04-27 04:15:42 -03:00
Nicolás Ramírez
6df274bba5 added copy method to item 2013-04-19 13:23:53 -03:00
Pablo Hoffman
96c2332e0e fix inaccurate downloader middleware documentation. refs #280 2013-04-02 11:35:32 -03:00
Steven Almeroth
70179c7c0c doc: remove trailing spaces 2013-03-21 13:57:39 -06:00
Steven Almeroth
0d7747d353 doc: Response.replace() cannot take meta argument
>>> response.replace(meta={'foo':1})
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/text.py", line 45, in replace
    return Response.replace(self, *args, **kwargs)
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/__init__.py", line 77, in replace
    return cls(*args, **kwargs)
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/text.py", line 22, in __init__
    super(TextResponse, self).__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'meta'
2013-03-21 13:49:55 -06:00
Pablo Hoffman
2a5c7ed4da make Crawler.start() return a deferred that is fired when the crawl is finished 2013-03-20 14:48:59 -03:00
Pablo Hoffman
b347c14b5f update engine status output on telnet console documentation 2013-03-18 19:12:12 -03:00
Shane Evans
5c2a82f1f7 fix typo 2013-03-17 19:34:55 +00:00
Steven Almeroth
f62b6660d4 doc: fix typo in spider middleware 2013-03-02 19:46:31 -06:00
Pablo Hoffman
7400ceb1ed added 502 to RETRY_HTTP_CODES 2013-02-22 19:12:59 -02:00
Pablo Hoffman
a038f46859 doc: fixed rst title 2013-02-14 11:11:17 -02:00
Pablo Hoffman
22edc44c6c doc: remove links to diveintopython.org, which is no longer available. closes #246 2013-02-14 11:09:40 -02:00
Daniel Graña
5db45b3825 remove scrapyd, it was migrated to its own repository 2013-02-06 05:24:07 +00:00
whodatninja
8e3b5baac5 Fix typo labeling attrs type bool instead of list 2013-02-05 15:10:41 -05:00
Chris Tilden
aae6aed4fb fixes spelling errors in documentation 2013-01-22 14:52:18 -08:00
Pablo Hoffman
6ab8afb992 improve documentation about removing namespaces 2013-01-18 12:35:30 -02:00
Pablo Hoffman
1ba04b1fc3 added remove_namespaces() method to XmlXPathSelector objects 2013-01-18 12:20:03 -02:00
Pablo Hoffman
c31441a273 revert default HTTP cache policy to dummy (instead of RFC2616) 2013-01-17 13:08:29 -02:00
Daniel Graña
897195186a document new FormRequest parameter named formxpath that matches forms using xpath 2013-01-08 18:36:20 -02:00
Daniel Graña
75563b3f00 Add list of supported and missing RFC2616 caching features 2013-01-08 18:16:44 -02:00