Paul Tremberth
|
57f30bcb04
|
Docs: 4-space indent for final spider example
|
2014-02-01 23:34:55 +01:00 |
|
Pablo Hoffman
|
e31fb49320
|
make 'basic' the default template spider in genspider, and added info with next steps to startproject. closes #488
|
2014-02-01 17:38:31 -02:00 |
|
Pablo Hoffman
|
c928eef585
|
Merge pull request #485 from bblanchon/sgml-extractor-link-text
SgmlLinkExtractor: Fixed link text when there is an inner tag
|
2014-02-01 10:28:56 -08:00 |
|
Daniel Graña
|
e1c6d3ffdf
|
Merge pull request #566 from redapple/offsite-stats
OffsiteMiddleware: add 2 stats counters
|
2014-01-31 10:05:35 -08:00 |
|
Daniel Graña
|
1e553ec0cd
|
Merge pull request #570 from redapple/travistests
[MRG] Fix tests for Travis-CI build
|
2014-01-31 09:16:54 -08:00 |
|
Paul Tremberth
|
cc157f182b
|
Fix tests for Travis-CI build
|
2014-01-31 01:27:54 +01:00 |
|
Mikhail Korobov
|
05158803df
|
Merge pull request #571 from darkrho/httpcache-storage-doc
DOC Fixed HTTPCACHE_STORAGE typo in the default value which is now Files...
|
2014-01-30 08:48:39 -08:00 |
|
Rolando Espinoza
|
a6279fe95b
|
DOC Fixed HTTPCACHE_STORAGE typo in the default value, which is now Filesystem instead of Dbm.
|
2014-01-30 11:53:42 -04:00 |
|
Paul Tremberth
|
fd5b40593a
|
Always enable offsite stats + refactor test to initialize crawler
|
2014-01-29 17:37:24 +01:00 |
|
Paul Tremberth
|
1a545157c6
|
Offsite: add 2 stats counters
|
2014-01-29 17:37:24 +01:00 |
|
Pablo Hoffman
|
21f0e4048b
|
Merge pull request #565 from dangra/562-sgmlinkextractor
replace unencodeable codepoints with html entities
|
2014-01-27 22:01:39 -08:00 |
|
Pablo Hoffman
|
116a1df4b0
|
Merge pull request #557 from darkrho/expose-crawler-in-shell
Expose current crawler in the scrapy shell.
|
2014-01-27 21:08:02 -08:00 |
|
Daniel Graña
|
66829c962f
|
replace unencodeable codepoints with html entities. fixes #562 and #285
|
2014-01-27 11:37:09 -02:00 |
|
Daniel Graña
|
8ecf0b786d
|
Merge pull request #556 from darkrho/item-loader-nones
Make `ItemLoader` ignore `None` values from processors.
|
2014-01-27 04:21:09 -08:00 |
|
Daniel Graña
|
b14dabb281
|
Merge pull request #561 from redapple/regexlx-encoding
RegexLinkExtractor: encode URL unicode value when creating Links
|
2014-01-24 07:33:23 -08:00 |
|
Rolando Espinoza
|
4255e12bc7
|
Updated the tutorial crawl output with latest output.
|
2014-01-23 18:18:56 -04:00 |
|
Rolando Espinoza
|
9aab9224cb
|
Updated shell docs with the crawler reference and fixed the actual shell output.
Also updated the shell example with a reproducible code example.
|
2014-01-23 18:04:57 -04:00 |
|
Paul Tremberth
|
f87859a627
|
RegexLinkExtractor: encode URL unicode value when creating Links
|
2014-01-23 19:41:18 +01:00 |
|
Rolando Espinoza
|
4081ba238d
|
PEP8 minor edits.
|
2014-01-23 11:13:54 -04:00 |
|
Rolando Espinoza
|
240fdde667
|
Expose current crawler in the scrapy shell.
|
2014-01-23 11:08:56 -04:00 |
|
Rolando Espinoza
|
b93412059d
|
Unused re import and PEP8 minor edits.
|
2014-01-23 10:37:33 -04:00 |
|
Rolando Espinoza
|
420efe77b2
|
Ignore None values when using the ItemLoader.
|
2014-01-23 10:36:06 -04:00 |
|
Pablo Hoffman
|
2d60f86084
|
Merge pull request #535 from redapple/xpath-smartstrings
Disable smart strings in lxml XPath evaluations
|
2014-01-22 07:41:31 -08:00 |
|
Daniel Graña
|
677afe7e54
|
show ubuntu setup instructions as literal code
|
2014-01-20 16:22:53 -02:00 |
|
Daniel Graña
|
1c514c5fe9
|
Merge pull request #549 from dangra/509-scrapy-apt-repo
[MRG] update instruction to install using ubuntu packages
|
2014-01-20 09:27:57 -08:00 |
|
Daniel Graña
|
ebfb5b7096
|
replace warning about updating package lists by a note on package upgrade
|
2014-01-20 15:18:34 -02:00 |
|
Daniel Graña
|
52c3ff9190
|
fix apt-get line
|
2014-01-20 14:39:26 -02:00 |
|
Paul Tremberth
|
a0e25aec00
|
Use assertTrue/False
|
2014-01-20 17:29:16 +01:00 |
|
Daniel Graña
|
eb73ddd301
|
Update Ubuntu installation instructions
|
2014-01-20 14:15:01 -02:00 |
|
stray-leone
|
7f30a671c3
|
Modify the version of the Scrapy Ubuntu package
The latest version is 0.22.
With scrapy-0.18, the tutorial project raises an error.
Related issue: https://github.com/scrapy/scrapy/issues/511
|
2014-01-20 10:24:34 -02:00 |
|
Daniel Graña
|
431f2d109f
|
fix 0.22.0 release date
|
2014-01-17 17:54:01 -02:00 |
|
Mikhail Korobov
|
dfd13f9941
|
fix typos in news.rst and remove (not released yet) header
|
2014-01-18 01:50:19 +06:00 |
|
Paul Tremberth
|
001cf39ff4
|
Add testcase to check that default Selector doesn't return smart strings
|
2014-01-17 20:34:59 +01:00 |
|
Paul Tremberth
|
5eb336215c
|
Make lxml smart strings functionality customizable
|
2014-01-17 20:34:59 +01:00 |
|
Paul Tremberth
|
1f184ed7bf
|
Disable smart strings in lxml XPath evaluations
|
2014-01-17 20:33:32 +01:00 |
|
Daniel Graña
|
d8164bd5f4
|
bump version to 0.23
0.23.0
|
2014-01-17 15:57:10 -02:00 |
|
Daniel Graña
|
d1f1b074e1
|
Merge 0.22.0 release notes
|
2014-01-17 15:55:46 -02:00 |
|
Daniel Graña
|
3c1e22618b
|
Merge pull request #541 from kmike/fs-as-default-cache
[MRG] Make Filesystem storage backend default again.
|
2014-01-17 09:10:26 -08:00 |
|
Mikhail Korobov
|
7b7a1d8dfd
|
Make Filesystem storage backend default again. See GH-500.
|
2014-01-17 04:32:08 +06:00 |
|
Daniel Graña
|
f0851e41ec
|
Merge pull request #478 from redapple/offsitetests
Add tests for OffsiteMiddleware() + use re.escape() in domains regexp
|
2014-01-16 14:02:26 -08:00 |
|
Daniel Graña
|
151f9478c1
|
Merge pull request #538 from kmike/ajaxcrawlable-rename
[MRG] Rename AjaxCrawlableMiddleware to AjaxCrawlMiddleware
|
2014-01-16 10:33:46 -08:00 |
|
Mikhail Korobov
|
b03fe04999
|
Rename AjaxCrawlableMiddleware to AjaxCrawlMiddleware
|
2014-01-16 23:09:37 +06:00 |
|
Pablo Hoffman
|
ed6fd4933f
|
Merge pull request #524 from hobsonlane/master
documentation code example corrections per pablohoffman
|
2014-01-16 06:44:51 -08:00 |
|
Pablo Hoffman
|
71ada5476e
|
Merge pull request #472 from redapple/exslt
Register EXSLT namespaces by default (resolves #470)
|
2014-01-16 06:32:05 -08:00 |
|
Daniel Graña
|
b9bb9bed6b
|
Merge pull request #343 from kmike/ajax-crawlable
[MRG] AjaxCrawlableMiddleware
|
2014-01-16 05:07:39 -08:00 |
|
Daniel Graña
|
e2a5310f92
|
Merge pull request #537 from dangra/warn-xpathselector-subclass
warn XPathSelector deprecation on subclassing and direct instance
|
2014-01-16 04:14:27 -08:00 |
|
Daniel Graña
|
5a175ad287
|
Merge pull request #519 from dangra/warn-once
Warn BaseSpider deprecation only once
|
2014-01-16 04:07:21 -08:00 |
|
Daniel Graña
|
b3be6e210d
|
warn XPathSelector deprecation on subclassing and direct instance
|
2014-01-16 09:53:59 -02:00 |
|
Daniel Graña
|
3e42646ce1
|
do not test multiple instantiation warnings
|
2014-01-16 09:49:51 -02:00 |
|
Hobson Lane
|
85a80d0752
|
remove "for brevity's sake" line and correct "Torrent item"
Torrent item -> TorrentItem class
|
2014-01-15 17:29:23 -08:00 |
|