mirror of https://github.com/scrapy/scrapy.git synced 2025-02-25 09:43:46 +00:00

779 Commits

Author SHA1 Message Date
Jakob de Maeyer
a769a1ef78 Introduce BaseSettings with full dictionary interface 2015-10-27 12:38:52 +01:00
Valdir Stumm Jr
d577c4702d fixed a typo in the documentation. 2015-10-26 00:00:20 -02:00
Mikhail Korobov
1b6d60c251 DOC fix docs after GH-1289. 2015-10-09 01:26:09 +05:00
hoatle
2869cf8dde fix another invalid xpath error 2015-10-07 16:03:43 +07:00
Hoat Le
4e66955411 fix ValueError: Invalid XPath: //div/[id="not-exists"]/text() on selectors.rst
>>> response.xpath('//div/[id="not-exists"]/text()').extract_first() is None
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/vagrant/.virtualenvs/scrapy/lib/python2.7/site-packages/scrapy/http/response/text.py", line 109, in xpath
    return self.selector.xpath(query)
  File "/home/vagrant/.virtualenvs/scrapy/lib/python2.7/site-packages/scrapy/selector/unified.py", line 100, in xpath
    raise ValueError(msg if six.PY3 else msg.encode("unicode_escape"))
ValueError: Invalid XPath: //div/[id="not-exists"]/text()
2015-10-07 15:43:02 +07:00
smirecki
8379bea7ed Typos corrections
I've made a few small corrections, some spelling changes and typo fixes.
I've tried to respect regional spelling differences and avoided proposing hyphenating compound words.
2015-10-02 23:48:27 -04:00
Marius Gedminas
0620e76433 Fix list formatting 2015-09-29 03:33:30 +05:00
Marius Gedminas
eaad10facf Typo 2015-09-28 14:44:15 +05:00
Daniel Graña
3c596dcf46 Merge pull request #1467 from dacjames/master
add support for a nested loaders
2015-09-16 22:03:19 -03:00
hy
14f7f22555 fix typos in downloader-middleware.rst and exceptions.rst, middlware -> middleware 2015-09-16 16:59:23 +08:00
Daniel Collins
036109e7de update nested loader documentation 2015-09-15 23:49:35 -07:00
Daniel Collins
88c92cb68b provide documentation for nested loaders 2015-09-04 13:15:48 -07:00
Artur Gaspar
9ce9a293a6 Always check robots.txt before making another request in RobotsTxtMiddleware. 2015-09-02 10:23:24 -03:00
Artur Gaspar
ca83a0b028 Support for returning deferreds in downloader middleware methods. 2015-09-01 13:22:43 -03:00
David Tagatac
08162a15d8 minor: scrapy.Spider docs grammar 2015-08-27 17:37:16 -04:00
Mikhail Korobov
9616d91e4a Merge pull request #1444 from cyberplant/bpython_support
[MRG +1] bpython support
2015-08-27 21:28:05 +05:00
Mikhail Korobov
cfae62f9cc Merge pull request #1441 from aivarsk/fix-common-practices
Make common practices sample code match the comments
2015-08-23 17:36:09 +05:00
Jakob de Maeyer
d164398a27 Fix RedirectMiddleware not honouring meta handle_httpstatus keys 2015-08-21 13:22:42 +02:00
nyov
509cc8d41e Add support for bpython console.
Adds support for configuration of shells from scrapy.cfg
and SCRAPY_PYTHON_SHELL.

config snippet:

cat <<EOF >> ~/.scrapy.cfg
[settings]
# shell can be one of ipython, bpython or python;
# to be tried as the interactive python console
# (in above order, unless set here).
shell = python
EOF

(closes #270, #1100, #1301)
2015-08-21 01:12:58 +01:00
Aivars Kalvāns
b8b1e8e544 Make common practices sample code match the comments 2015-08-19 16:54:10 +03:00
Daniel Graña
5e6c492967 Merge pull request #1364 from jdemaeyer/enhancement/spider-handles-redirects
[MRG+1] Make RedirectMiddleware respect Spider.handle_httpstatus_list
2015-08-02 23:00:00 -03:00
David Tagatac
08123207c5 minor: scrapy.Spider grammar and clarity 2015-07-31 17:01:59 -04:00
Jakob de Maeyer
c908d31660 Make RedirectMiddleware respect Spider.handle_httpstatus_list 2015-07-16 12:50:26 +02:00
Julia Medina
d706310d8b Merge pull request #1151 from marven/cache-control
[MRG+1] RFC2616 policy enhancements + tests
2015-07-11 08:06:20 -03:00
Julia Medina
8b3ca4f250 Merge pull request #1302 from eliasdorneles/improving-access-settings-docs
[MRG+1] Improvements for docs on how to access settings
2015-07-03 00:56:32 -03:00
Daniel Graña
3fc4e0b319 Merge pull request #1282 from otherchirps/memusage-check-interval
[MRG+1] Added MEMUSAGE_CHECK_INTERVAL_SECONDS to Memory usage extension options.
2015-07-02 13:50:55 -03:00
Mikhail Korobov
d850238c22 add AUTOTHROTTLE_TARGET_CONCURRENCY option and expand AutoThrottle docs 2015-06-27 04:59:42 +05:00
Mikhail Korobov
63317531f9 DOC fix autothrottle docs
see https://github.com/scrapy/scrapy/pull/502/files#r8574692
2015-06-26 20:47:58 +05:00
Pablo Hoffman
38e5bfb61c remove version suffix from ubuntu package 2015-06-22 10:57:24 -03:00
Elias Dorneles
2de5c66058 improvements for docs on how to access settings 2015-06-15 13:07:55 -03:00
Julia Medina
fa1c25c840 Merge pull request #1286 from scrapy/configure_logging
configure_logging: change the meaning of settings=None
2015-06-12 13:22:42 -03:00
Julia Medina
36bc912cdd DOC indent additional docs for configure_logging 2015-06-12 13:00:31 -03:00
Bryan Crowe
6a4c475e87 Fix a couple typos 2015-06-11 19:47:30 -04:00
Daniel Graña
5bd0395be4 Merge pull request #1291 from scrapy/signalmanager-docstrings
DOC SignalManager docstrings. See GH-713.
2015-06-10 16:28:35 -03:00
Mikhail Korobov
6c9daf3a95 DOC remove unnecessary links; fix references in send_catch_log_deferred docstring 2015-06-10 01:44:19 +05:00
Mikhail Korobov
a611f8dd2d DOC remove FailureFormatter mentions, stop copy-pasting configure_logging docstring 2015-06-09 22:57:18 +05:00
Mikhail Korobov
790c67b643 DOC spider_error doesn't support deferreds 2015-06-09 02:20:10 +05:00
Mikhail Korobov
1740fcf1a6 DOC SignalManager docstrings. See GH-713.
This change is not 100% backwards compatible because of *args changes.
Their usage was not documented, so we're not breaking public interface.
2015-06-08 21:05:58 +05:00
Mikhail Korobov
9a787893e3 (backwards-incompatible) allow to pass settings=None to configure_logging
* use explicit argument for disabling root handler;
* handle LOG_STDOUT even if install_root_handler is False
2015-06-08 19:54:18 +05:00
Chris Nilsson
0c532baf4c Removed typo, and clarified time unit of setting 2015-06-06 11:18:13 +10:00
Mikhail Korobov
d047665c02 make "settings" argument optional for Crawler, CrawlerRunner and CrawlerProcess 2015-06-06 03:23:13 +05:00
Chris Nilsson
eae25a04d9 Added MEMUSAGE_CHECK_INTERVAL_SECONDS to Memory usage extension options.
Kept the default as it was, at 60.0 seconds. But added a setting to
allow this to be changed as desired.
2015-06-06 00:39:14 +10:00
Julia Medina
367ea81e71 Remove deprecated %z formatting from the default LOG_DATEFORMAT 2015-06-04 04:11:23 +08:00
Marven Sanchez
8771d1f79b Update HTTPCache middleware docs 2015-06-01 18:20:59 +08:00
Mikhail Korobov
342cb622f1 DOC fix non-working link (by removing it).
See https://github.com/scrapy/scrapy/pull/1260
2015-05-27 23:04:58 +05:00
Pablo Hoffman
545c4224f9 update old crawlera link 2015-05-25 16:01:54 -03:00
Mikhail Korobov
cc2258b2bb Merge pull request #1145 from bosnj/master
[MRG+1] default return value for extract_first
2015-05-21 22:03:54 +05:00
Mikhail Korobov
9b0ca1b7a0 drop support for FEED_EXPORT_FIELDS=[] meaning "no fields" 2015-05-18 17:13:25 +05:00
Mikhail Korobov
9fb318338b support FEED_EXPORT_FIELDS=[] 2015-05-18 16:44:02 +05:00
Mikhail Korobov
e1efd19175 TST, DOC document that Scrapy only infers field names for CSV 2015-05-18 16:43:23 +05:00