Elias Dorneles
4dcecc98f9
moved example data to a better place
2015-03-26 15:45:17 -03:00
Elias Dorneles
7402e27230
fix community link
2015-03-26 15:35:31 -03:00
Elias Dorneles
729861c864
fixing indentation
2015-03-26 15:31:42 -03:00
Elias Dorneles
13d0ecde77
addressing more review comments, to avoid ambiguity on desired reading flow
2015-03-26 15:26:16 -03:00
Elias Dorneles
76e3bf1250
addressing comments from the review plus further editing
2015-03-26 14:26:20 -03:00
Elias Dorneles
8f4a268f37
added bit about async requests, improved phrasing
2015-03-26 12:14:56 -03:00
Elias Dorneles
32423d4a33
some improvements to overview page
2015-03-25 19:27:52 -03:00
Shadab Zafar
5a58d64131
Fix some redirection links in documentation
...
Fixes #606
2015-03-18 19:41:26 -03:00
Daniel Graña
a9292cfab7
jsonrpc webservice moved to https://github.com/scrapy/scrapy-jsonrpc repository
2014-08-15 23:28:13 -03:00
Daniel Graña
2ad8db6ae6
Merge pull request #761 from dangra/lxmlextractor
...
Promote LxmlLinkExtractor as LxmlExtractor
2014-06-25 15:07:02 -03:00
Daniel Graña
a9ecef5662
promote LxmlLinkExtractor as default in docs
2014-06-25 14:34:30 -03:00
Daniel Graña
5b2faf61c3
recognize jl extension as jsonlines exporter and update docs
2014-06-25 13:55:15 -03:00
Carlos Rivera
946b854ddf
grammatical issue
2014-05-06 15:41:59 -05:00
Daniel Graña
1117687c47
update docs
2014-04-23 23:39:58 -03:00
Mikhail Korobov
2d3803672b
DOC use top-level shortcuts in docs
2014-04-15 01:09:35 +06:00
Julia Medina
80081054a2
Fix broken links in documentation
2014-04-09 18:57:52 -03:00
Pablo Hoffman
ed6fd4933f
Merge pull request #524 from hobsonlane/master
...
documentation code example corrections per pablohoffman
2014-01-16 06:44:51 -08:00
Hobson Lane
85a80d0752
remove "for brevity's sake" line and correct "Torrent item"
...
Torrent item -> TorrentItem class
2014-01-15 17:29:23 -08:00
Hobson Lane
a3db95985b
another import name correction by pablo
2014-01-14 21:04:15 -08:00
Ferdy Rodriguez
807dd25324
fixed error on tor's name
2014-01-13 00:03:58 -06:00
Ferdy Rodriguez
8b9348cfaf
Changed TOR Info as previous was removed from www.mininova.org
2014-01-12 23:46:04 -06:00
Hobson Lane
6ba0857a5c
documentation code example corrections per pablohoffman
2014-01-10 10:37:27 -08:00
Pablo Hoffman
e8ee449a2a
Merge pull request #432 from darkrho/crawl-url
...
Removed URL reference in crawl command and .tld suffix in docs for spider names
2013-10-21 09:40:58 -07:00
Rolando Espinoza La fuente
34543c2b2e
DOCS removed .tld suffix for spider names for the sake of consistency.
2013-10-19 23:03:20 -04:00
Daniel Graña
155ea08ea1
use sel name for Selector's instances in docs, internals and shell
2013-10-15 15:58:42 -02:00
Daniel Graña
4645f9e03c
Updates docs to reflect unified selectors api
2013-10-14 16:31:20 -02:00
Hart
c00c4d7148
correction to description of example XPath retrieval in overview doc
2013-08-03 17:08:58 -07:00
Juan M Uys
4de3aa4932
Update overview.rst
2013-04-08 14:13:15 +02:00
Pablo Hoffman
2fb5e62c39
doc: update overview page to point to the genspider command. refs #107
2012-04-19 02:37:22 -03:00
Pablo Hoffman
ade5efdc61
added -o option to scrapy crawl, a convenient shortcut for using feed exports
2011-10-22 20:53:49 -02:00
Pablo Hoffman
76af0cdd44
updated documentation and code to use -s instead of --set option
2011-09-01 14:35:37 -03:00
Pablo Hoffman
5da6ffb57b
Automated merge with ssh://hg.scrapy.org:2222/scrapy-0.12
2011-08-11 09:11:19 -03:00
Pablo Hoffman
bc2d2183e9
fixed import in doc
2011-08-11 09:11:08 -03:00
Pablo Hoffman
c59340150f
Added cached DNS resolver based on old caching resolver extension from scrapy.contrib.resolver. This new one is *not* an extension, it comes builtin and always enabled.
2011-07-27 03:45:15 -03:00
Pablo Hoffman
57c43fdce6
added SitemapSpider, with tests and doc
2011-06-15 11:54:34 -03:00
Pablo Hoffman
5f65c26080
Some minor improvements to feature list in Scrapy at a Glance documentation page
2010-10-16 19:02:08 -02:00
Pablo Hoffman
e3d67d74f7
docs/intro/overview.rst: add example of scraped data and introduce loaders
2010-09-06 10:04:00 -03:00
Pablo Hoffman
00d55fbbd1
Updated 'Scrapy at a glance' document replacing item pipeline example by a simpler usage of feed exports
2010-09-05 23:38:37 -03:00
Pablo Hoffman
9aefa242d5
Applied documentation patch provided by Lucian Ursu (closes #207)
2010-08-21 01:26:35 -03:00
Pablo Hoffman
e741a807d2
Added new Feed exports extension with documentation and storage tests. Closes #197.
...
Also deprecated File export pipeline (to be removed in Scrapy 0.11).
Still need to add tests for FeedExport main extension code.
2010-08-17 14:27:48 -03:00
Pablo Hoffman
43d47e5d9b
Some improvements to Item Pipeline (closes #195):
...
* Made Item Pipeline Manager a subclass of scrapy.middleware.MiddlewareManager
* Added open_spider/close_spider methods with support for returning deferreds from them
* Inverted the process_item() arguments to be more friendly with deferred
callbacks (backwards compatibility kept through arguments introspection)
* Updated documentation with new methods and process_item() arguments change
2010-08-12 10:48:37 -03:00
Pablo Hoffman
6a33d6c4d0
* Added Scrapy Web Service with documentation and tests.
...
* Marked Web Console as deprecated.
* Removed Web Console documentation to discourage its use.
2010-06-09 13:46:22 -03:00
Rolando Espinoza La fuente
db5c3df679
SEP12 implementation
...
* Rename BaseSpider.domain_name to BaseSpider.name
This patch implements the domain_name to name change in BaseSpider class and
change all spider instantiations to use the new attribute.
* Add allowed_domains to spider
This patch implements the merging of spider.domain_name and
spider.extra_domain_names in spider.allowed_domains for offsite checking
purposes.
Note that spider.domain_name is not touched by this patch, only not used.
* Remove spider.domain_name references from scrapy.stats
* Rename domain_stats to spider_stats in MemoryStatsCollector
* Use ``spider`` instead of ``domain`` in SimpledbStatsCollector
* Rename domain_stats_history table to spider_data_history and rename domain
field to spider in MysqlStatsCollector
* Refactor genspider command
The new signature for genspider is: genspider [options] <domain_name>.
Genspider uses domain_name for spider name and for the module name.
* Remove spider.domain_name references
* Update crawl command signature <spider|url>
* docs: updated references to domain_name
* examples/experimental: use spider.name
* genspider: require <name> <domain>
* spidermanager: renamed crawl_domain to crawl_spider_name
* spiderctl: updated references of *domain* to spider
* added backward compatibility with legacy spider's attributes
'domain_name' and 'extra_domain_names'
2010-04-01 18:27:22 -03:00
Pablo Hoffman
99a876754c
Improved "What else?" section of "Scrapy at a glance" overview
2010-03-20 20:24:18 -03:00
Pablo Hoffman
7728a23e99
Changed item pipeline API to pass spider references (instead of domain names) to process_item() method
2009-11-06 13:46:36 -02:00
Ismael Carnales
5862ba7db7
modified doc to reflect the new spider callback return policy (lists not needed)
2009-09-22 11:25:40 -03:00
Ismael Carnales
39540b188a
changed torrent in overview doc
2009-08-24 15:11:04 -03:00
Pablo Hoffman
9635a7839c
rearranged documentation into a better organization
...
--HG--
rename : docs/topics/index.rst => docs/index.rst
2009-08-21 21:49:54 -03:00
Pablo Hoffman
e8504a054c
moved scrapy.newitem to scrapy.item and declared newitem api officially stable. updated docs and example project. deprecated old ScrapedItem
2009-08-19 21:39:58 -03:00
Ismael Carnales
48b40bd620
renamed x method of selectors to select
2009-08-17 15:58:06 -03:00