Rolando Espinoza
0cc1d870db
docs: minor tidy up sample code and missing shell prompts.
2013-09-30 09:58:10 -04:00
Loren Davie
8af0e89e85
Corrected typo.
2013-09-29 17:06:46 -04:00
Loren Davie
f49f5724d5
Added dynamic creation of item classes to practices.rst.
2013-09-28 09:00:48 -04:00
irgmedeiros
9b50409986
Update the second code example
...
Update the second code example to reflect the last change in the first example.
2013-09-27 18:22:33 -03:00
irgmedeiros
d9e0fdc9aa
Update practices.rst
...
With this modification, Scrapy runs the spider with the project settings. The previous example ran with the default settings only, so all user settings (pipelines, for example) were ignored.
2013-09-27 17:56:30 -03:00
Daniel Graña
265910aae6
Merge pull request #363 from taikano/sitemap_alternate
...
also fetch alternate URLs from sitemaps, see #360
2013-09-26 09:15:02 -07:00
Pablo Hoffman
12280c2a95
fix sphinx references in doc
2013-09-25 15:13:17 -03:00
Pablo Hoffman
fc388f4636
Make ITEM_PIPELINES setting a dict
...
This is for consistency with how spider and downloader middlewares are
defined. ITEM_PIPELINES_BASE was also added and both remain empty.
Backwards compatibility is kept (with a warning) with list-based
ITEM_PIPELINES.
2013-09-23 17:50:43 -03:00
cacovsky
71b320914a
Update request-response.rst
...
Fix small doc typo (too many backticks)
2013-09-18 11:45:25 -03:00
Stefan
6994959181
renamed to sitemap_alternate_links and added default value, see #360
2013-09-08 10:38:28 +02:00
Stefan
8ed2d0cda1
improved changes to allow retrieval of alternate links in sitemaps, see #360
2013-09-07 12:56:30 +02:00
Pablo Hoffman
86230c0ab8
added quantal & raring to supported Ubuntu releases
2013-08-22 21:49:55 -03:00
Mikhail Korobov
034ffae60f
Recommend Pillow instead of PIL. Closes GH-317.
2013-08-18 00:44:01 +06:00
Berend Iwema
32b6364bcd
#327 - Support STARTTLS / SSL option in email sender
2013-08-14 12:59:01 +02:00
Rocio Aramberri
d227d530f6
Added COMPRESSION_ENABLED setting to enable or disable the HttpCompressionMiddleware
...
Added COMPRESSION_ENABLED setting to docs
Added COMPRESSION_ENABLED setting to default settings
2013-08-01 11:31:28 -03:00
Dan
1ca31244b0
Fixed ordering of super argument call.
2013-07-16 14:50:10 -04:00
Dan
e12b689c4f
Updated documentation of spider arguments to include required super call.
2013-07-16 14:26:53 -04:00
Mikhail Korobov
1a1c93fafe
tiny FormRequest doc fix
2013-07-15 15:47:34 +06:00
Mikhail Korobov
ac2fadf3ab
DownloaderMiddleware.process_response docs fix
...
"returns an exception" -> "raises an exception"
2013-07-08 19:41:58 +06:00
Mikhail Korobov
39e5da5f66
improve docs for DownloaderMiddleware.process_response
2013-07-08 19:17:29 +06:00
Pablo Hoffman
0f4b70f582
remove now-deprecated request_scheduled signal
...
It will be replaced by more accurate scheduler signals (proposal will
come soon)
2013-06-27 11:23:24 -03:00
nramirezuy
bef8ade956
removed request_received and added request_scheduled
2013-06-26 16:45:46 -03:00
Pablo Hoffman
819b2776dd
Merge pull request #326 from berendiwema/master
...
Include example of how to stop the reactor from script
2013-06-25 13:30:07 -07:00
nramirezuy
83b2774354
remove wrong default httpcache
2013-06-25 17:01:29 -03:00
Berend Iwema
aec314db09
added a bit more documentation on how to close the reactor when running scrapy from a script
2013-06-25 16:08:22 +02:00
Pablo Hoffman
bbde1d0e0b
Merge pull request #275 from stav/doc
...
doc: Response.replace() cannot take meta argument
2013-06-24 11:09:28 -07:00
Capi Etheriel
50fa46d183
Document CrawlSpider.parse_start_url method
2013-06-09 04:03:20 -03:00
Pablo Hoffman
8e49fed918
minor improvements to benchmarking doc
2013-05-16 13:23:13 -03:00
Pablo Hoffman
76087e336a
add scrapy bench command for benchmarking, with documentation
2013-05-16 13:15:25 -03:00
Pablo Hoffman
66311db23e
mention crawlera in best practices, as a way to deal with bans
2013-05-04 18:20:23 -03:00
Pablo Hoffman
9361c89573
remove scrapyd doc, as it was moved to its own repo
2013-04-27 04:15:42 -03:00
Nicolás Ramírez
6df274bba5
added copy method to item
2013-04-19 13:23:53 -03:00
Pablo Hoffman
96c2332e0e
fix inaccurate downloader middleware documentation. refs #280
2013-04-02 11:35:32 -03:00
Steven Almeroth
70179c7c0c
doc: remove trailing spaces
2013-03-21 13:57:39 -06:00
Steven Almeroth
0d7747d353
doc: Response.replace() cannot take meta argument
...
>>> response.replace(meta={'foo':1})
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/text.py", line 45, in replace
    return Response.replace(self, *args, **kwargs)
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/__init__.py", line 77, in replace
    return cls(*args, **kwargs)
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/text.py", line 22, in __init__
    super(TextResponse, self).__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'meta'
2013-03-21 13:49:55 -06:00
Pablo Hoffman
2a5c7ed4da
make Crawler.start() return a deferred that is fired when the crawl is finished
2013-03-20 14:48:59 -03:00
Pablo Hoffman
b347c14b5f
update engine status output on telnet console documentation
2013-03-18 19:12:12 -03:00
Shane Evans
5c2a82f1f7
fix typo
2013-03-17 19:34:55 +00:00
Steven Almeroth
f62b6660d4
doc: fix typo in spider middleware
2013-03-02 19:46:31 -06:00
Pablo Hoffman
7400ceb1ed
added 502 to RETRY_HTTP_CODES
2013-02-22 19:12:59 -02:00
Pablo Hoffman
a038f46859
doc: fixed rst title
2013-02-14 11:11:17 -02:00
Pablo Hoffman
22edc44c6c
doc: remove links to diveintopython.org, which is no longer available. closes #246
2013-02-14 11:09:40 -02:00
Daniel Graña
5db45b3825
remove scrapyd, it was migrated to its own repository
2013-02-06 05:24:07 +00:00
whodatninja
8e3b5baac5
Fix typo labeling attrs type bool instead of list
2013-02-05 15:10:41 -05:00
Chris Tilden
aae6aed4fb
fixes spelling errors in documentation
2013-01-22 14:52:18 -08:00
Pablo Hoffman
6ab8afb992
improve documentation about removing namespaces
2013-01-18 12:35:30 -02:00
Pablo Hoffman
1ba04b1fc3
added remove_namespaces() method to XmlXPathSelector objects
2013-01-18 12:20:03 -02:00
Pablo Hoffman
c31441a273
revert default HTTP cache policy to dummy (instead of RFC2616)
2013-01-17 13:08:29 -02:00
Daniel Graña
897195186a
document new FormRequest parameter named formxpath
that matches forms using xpath
2013-01-08 18:36:20 -02:00
Daniel Graña
75563b3f00
Add list of supported and missing RFC2616 caching features
2013-01-08 18:16:44 -02:00