Julia Medina
fc346cba4d
Move scrapy/contrib/spiders to scrapy/spiders
2015-04-29 21:27:19 -03:00
Julia Medina
7a92dae4c8
Change Scrapy log output through docs
2015-04-22 17:27:24 -03:00
Elias Dorneles
d63567531d
change data extraction in crawl example to be consistent with tutorial, removed statement implying mandatory usage of Item
2015-04-21 11:30:48 -03:00
Elias Dorneles
f7da69d116
fixing example CSS expr
2015-04-21 11:19:10 -03:00
Elias Dorneles
ff007afb9d
expanded crawling primer with examples, and applied other suggestions from the review
2015-04-21 10:57:44 -03:00
Elias Dorneles
595146e158
some improvements for Scrapy tutorial
2015-04-20 21:10:07 -03:00
Mikhail Korobov
1794a893f4
Merge pull request #1172 from bagratte/docs
minor corrections in documentation.
2015-04-19 21:41:29 +05:00
bagratte
beea9267a1
minor corrections in documentation.
2015-04-18 19:48:25 +04:00
rajathkumarmp
02629b5f7b
Added link to ipython in doc.
2015-04-18 13:00:34 +05:30
bagratte
8d339da4e5
add some minor stylistic and grammar corrections to tutorial.rst.
2015-04-17 20:55:02 +04:00
Peter Bronez
475766c73a
Converted sel.xpath() calls to response.xpath() in Extracting the data
2015-03-26 15:34:30 -04:00
Shadab Zafar
5a58d64131
Fix some redirection links in documentation
Fixes #606
2015-03-18 19:41:26 -03:00
Zbigniew Siciarz
0466e8cb7a
Fixed Python syntax in tutorial.
2014-07-04 10:38:01 +02:00
Daniel Graña
5b2faf61c3
recognize jl extension as jsonlines exporter and update docs
2014-06-25 13:55:15 -03:00
Julia Medina
bdca06240c
Fix settings repr on the logs of the shell and tutorial docs topics
2014-06-10 11:26:50 -03:00
Daniel Graña
1117687c47
update docs
2014-04-23 23:39:58 -03:00
Mikhail Korobov
2d3803672b
DOC use top-level shortcuts in docs
2014-04-15 01:09:35 +06:00
Alexey Bezhan
210a0a6fe1
Fix some typos, whitespace and small errors in docs
2014-02-27 18:02:22 +00:00
Paul Tremberth
57f30bcb04
Docs: 4-space indent for final spider example
2014-02-01 23:34:55 +01:00
Rolando Espinoza
4255e12bc7
Updated the tutorial crawl output with latest output.
2014-01-23 18:18:56 -04:00
Rolando Espinoza
9aab9224cb
Updated shell docs with the crawler reference and fixed the actual shell output.
Also updated the shell example with a reproducible code example.
2014-01-23 18:04:57 -04:00
Mikhail Korobov
a27d91f0a6
Rename BaseSpider to Spider. See GH-495.
2013-12-30 19:46:41 +06:00
RasPat1
ff21281b95
Note about selector class import
This is the salient point of this code compared to the last example. We have a selector now and this is how we use it. Especially since the user has just come from the shell where the pre-instantiated selector is taken for granted.
2013-12-15 13:46:42 -05:00
Pablo Hoffman
f2741c413e
fix method name in tutorial. closes GH-480
2013-12-02 13:24:12 -02:00
Daniel Graña
155ea08ea1
use sel name for Selector's instances in docs, internals and shell
2013-10-15 15:58:42 -02:00
Daniel Graña
1abb1af0c6
fix typos and wording on selector's introduction
2013-10-15 10:13:43 -02:00
Daniel Graña
4645f9e03c
Updates docs to reflect unified selectors api
2013-10-14 16:31:20 -02:00
Pablo Hoffman
e1683ddf9b
fix doc typo
2013-10-09 17:24:12 -02:00
Pablo Hoffman
b1d1a36a1e
add note about enclosing urls with quotes when running from command-line. closes GH-384
2013-09-18 18:01:28 -03:00
Kumara Tharmalingam
bbb0603091
Fixed directory location for dmoz_spider.py file
It should be under 'tutorial/spiders' not 'dmoz/spiders'
2013-09-15 21:55:52 -07:00
Pablo Hoffman
22edc44c6c
doc: remove links to diveintopython.org, which is no longer available. closes #246
2013-02-14 11:09:40 -02:00
Valentin-Costel Hăloiu
00bfb37e79
Update master
2012-07-04 06:55:01 +03:00
Pablo Hoffman
2fb5e62c39
doc: update overview page to point to the genspider command. refs #107
2012-04-19 02:37:22 -03:00
Pablo Hoffman
0be421fbf0
fixed reference to tutorial directory
2011-12-23 18:57:11 -02:00
Daniel Graña
bcb31988f2
change tutorial to follow changes on dmoz site
2011-12-14 13:03:31 -02:00
Pablo Hoffman
ade5efdc61
added -o option to scrapy crawl, a convenient shortcut for using feed exports
2011-10-22 20:53:49 -02:00
Pablo Hoffman
76af0cdd44
updated documentation and code to use -s instead of --set option
2011-09-01 14:35:37 -03:00
Pablo Hoffman
5bf733b6f6
Changed default representation of items to pretty-printed dicts. This improves default logging by making log more readable in the default case, for both Scraped and Dropped lines.
Projects can still customize how items are represented by overriding the item's __str__ method, as usual.
2011-06-03 01:13:01 -03:00
Pablo Hoffman
951ba507f9
Removed support for default values in Scrapy items, which have proven confusing in the past
2011-05-19 21:42:46 -03:00
Pablo Hoffman
503f302010
removed remaining references to scheduler middleware from doc, as it will be removed in the next release
2011-05-18 19:48:48 -03:00
Pablo Hoffman
bb2b67c862
updated tutorial to use 'dmoz' as the name of the spider instead of 'dmoz.org', so that it's more similar to the dirbot example project
2011-04-28 09:31:57 -03:00
Pablo Hoffman
bf73002428
removed googledir example, replaced by dirbot project on github. updated docs accordingly
2011-04-28 02:28:39 -03:00
Pablo Hoffman
181d1c09ae
Fixed typo and code indentation in the doc. Closes #307 and #308
2011-02-09 11:19:46 -02:00
Pablo Hoffman
b4fbc6c5fa
Updated Scrapy Tutorial to reference feed exports instead of a custom-written pipeline, and extended item pipeline documentation to include a JSON writer.
2010-10-10 20:31:05 -02:00
Pablo Hoffman
f4accb6c7f
Updated dmoz xpaths of Scrapy tutorial
2010-10-07 18:22:01 -02:00
Pablo Hoffman
9aefa242d5
Applied documentation patch provided by Lucian Ursu (closes #207)
2010-08-21 01:26:35 -03:00
Pablo Hoffman
1d3b9e2ca8
Scrapy shell refactoring
2010-08-20 11:26:14 -03:00
Pablo Hoffman
7858244dca
Scrapy shell: moved python console starting code to scrapy.utils.console and get rid of noisy console banners
2010-08-20 01:33:02 -03:00
Pablo Hoffman
34554da201
Deprecated scrapy-ctl.py command in favour of simpler "scrapy" command. Closes #199. Also updated documentation accordingly and added convenient scrapy.bat script for running from Windows.
--HG--
rename : debian/scrapy-ctl.1 => debian/scrapy.1
rename : docs/topics/scrapy-ctl.rst => docs/topics/cmdline.rst
2010-08-18 19:48:32 -03:00
Pablo Hoffman
43d47e5d9b
Some improvements to Item Pipeline (closes #195):
* Made Item Pipeline Manager a subclass of scrapy.middleware.MiddlewareManager
* Added open_spider/close_spider methods with support for returning deferreds from them
* Inverted the process_item() arguments to be more friendly with deferred callbacks (backwards compatibility kept through arguments introspection)
* Updated documentation with new methods and process_item() arguments change
2010-08-12 10:48:37 -03:00