mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 13:23:59 +00:00

768 Commits

Author SHA1 Message Date
John O'Connor
d85da273be Fix typo in media pipeline docs 2016-10-28 19:44:46 -07:00
Artur Gaspar
9ce9a293a6 Always check robots.txt before making another request in RobotsTxtMiddleware. 2015-09-02 10:23:24 -03:00
Artur Gaspar
ca83a0b028 Support for returning deferreds in downloader middleware methods. 2015-09-01 13:22:43 -03:00
David Tagatac
08162a15d8 minor: scrapy.Spider docs grammar 2015-08-27 17:37:16 -04:00
Mikhail Korobov
9616d91e4a Merge pull request #1444 from cyberplant/bpython_support
[MRG +1] bpython support
2015-08-27 21:28:05 +05:00
Mikhail Korobov
cfae62f9cc Merge pull request #1441 from aivarsk/fix-common-practices
Make common practices sample code match the comments
2015-08-23 17:36:09 +05:00
Jakob de Maeyer
d164398a27 Fix RedirectMiddleware not honouring meta handle_httpstatus keys 2015-08-21 13:22:42 +02:00
nyov
509cc8d41e Add support for bpython console.
Adds support for configuration of shells from scrapy.cfg
and SCRAPY_PYTHON_SHELL.

config snippet:

cat <<EOF >> ~/.scrapy.cfg
[settings]
# shell can be one of ipython, bpython or python;
# to be tried as the interactive python console
# (in above order, unless set here).
shell = python
EOF

(closes #270, #1100, #1301)
2015-08-21 01:12:58 +01:00
Aivars Kalvāns
b8b1e8e544 Make common practices sample code match the comments 2015-08-19 16:54:10 +03:00
Daniel Graña
5e6c492967 Merge pull request #1364 from jdemaeyer/enhancement/spider-handles-redirects
[MRG+1] Make RedirectMiddleware respect Spider.handle_httpstatus_list
2015-08-02 23:00:00 -03:00
David Tagatac
08123207c5 minor: scrapy.Spider grammar and clarity 2015-07-31 17:01:59 -04:00
Jakob de Maeyer
c908d31660 Make RedirectMiddleware respect Spider.handle_httpstatus_list 2015-07-16 12:50:26 +02:00
Julia Medina
d706310d8b Merge pull request #1151 from marven/cache-control
[MRG+1] RFC2616 policy enhancements + tests
2015-07-11 08:06:20 -03:00
Julia Medina
8b3ca4f250 Merge pull request #1302 from eliasdorneles/improving-access-settings-docs
[MRG+1] Improvements for docs on how to access settings
2015-07-03 00:56:32 -03:00
Daniel Graña
3fc4e0b319 Merge pull request #1282 from otherchirps/memusage-check-interval
[MRG+1] Added MEMUSAGE_CHECK_INTERVAL_SECONDS to Memory usage extension options.
2015-07-02 13:50:55 -03:00
Mikhail Korobov
d850238c22 add AUTOTHROTTLE_TARGET_CONCURRENCY option and expand AutoThrottle docs 2015-06-27 04:59:42 +05:00
Mikhail Korobov
63317531f9 DOC fix authrottle docs
see https://github.com/scrapy/scrapy/pull/502/files#r8574692
2015-06-26 20:47:58 +05:00
Pablo Hoffman
38e5bfb61c remove version suffix from ubuntu package 2015-06-22 10:57:24 -03:00
Elias Dorneles
2de5c66058 improvements for docs on how to access settings 2015-06-15 13:07:55 -03:00
Julia Medina
fa1c25c840 Merge pull request #1286 from scrapy/configure_logging
configure_logging: change the meaning of settings=None
2015-06-12 13:22:42 -03:00
Julia Medina
36bc912cdd DOC indent additional docs for configure_logging 2015-06-12 13:00:31 -03:00
Bryan Crowe
6a4c475e87 Fix a couple typos 2015-06-11 19:47:30 -04:00
Daniel Graña
5bd0395be4 Merge pull request #1291 from scrapy/signalmanager-docstrings
DOC SignalManager docstrings. See GH-713.
2015-06-10 16:28:35 -03:00
Mikhail Korobov
6c9daf3a95 DOC remove unnecessary links; fix references in send_catch_log_deferred docstring 2015-06-10 01:44:19 +05:00
Mikhail Korobov
a611f8dd2d DOC remove FailureFormatter mentions, stop copy-pasting configure_logging docstring 2015-06-09 22:57:18 +05:00
Mikhail Korobov
790c67b643 DOC spider_error doesn't support deferreds 2015-06-09 02:20:10 +05:00
Mikhail Korobov
1740fcf1a6 DOC SignalManager docstrings. See GH-713.
This change is not 100% backwards compatible because of *args changes.
Their usage was not documented, so we're not breaking public interface.
2015-06-08 21:05:58 +05:00
Mikhail Korobov
9a787893e3 (backwards-incompatible) allow to pass settings=None to configure_logging
* use explicit argument for disabling root handler;
* handle LOG_STDOUT even if install_root_handler is False
2015-06-08 19:54:18 +05:00
Chris Nilsson
0c532baf4c Removed typo, and clarified time unit of setting 2015-06-06 11:18:13 +10:00
Mikhail Korobov
d047665c02 make "settings" argument optional for Crawler, CrawlerRunner and CrawlerProcess 2015-06-06 03:23:13 +05:00
Chris Nilsson
eae25a04d9 Added MEMUSAGE_CHECK_INTERVAL_SECONDS to Memory usage extension options.
Kept the default as it was, at 60.0 seconds. But added a setting to
allow this to be changed as desired.
2015-06-06 00:39:14 +10:00
Julia Medina
367ea81e71 Remove deprecated %z formatting from the default LOG_DATEFORMAT 2015-06-04 04:11:23 +08:00
Marven Sanchez
8771d1f79b Update HTTPCache middleware docs 2015-06-01 18:20:59 +08:00
Mikhail Korobov
342cb622f1 DOC fix non-working link (by removing it).
See https://github.com/scrapy/scrapy/pull/1260
2015-05-27 23:04:58 +05:00
Pablo Hoffman
545c4224f9 update old crawlera link 2015-05-25 16:01:54 -03:00
Mikhail Korobov
cc2258b2bb Merge pull request #1145 from bosnj/master
[MRG+1] default return value for extract_first
2015-05-21 22:03:54 +05:00
Mikhail Korobov
9b0ca1b7a0 drop support for FEED_EXPORT_FIELD=[] meaning "no fields" 2015-05-18 17:13:25 +05:00
Mikhail Korobov
9fb318338b support FEED_EXPORT_FIELDS=[] 2015-05-18 16:44:02 +05:00
Mikhail Korobov
e1efd19175 TST, DOC document that Scrapy only infers field names for CSV 2015-05-18 16:43:23 +05:00
Julia Medina
7c61bd897c Fix in docs for error introduced in #1218 2015-05-14 20:09:19 -03:00
Daniel Graña
1195f906d6 Merge pull request #1218 from Curita/move-base-to-packages
Move base classes to their packages
2015-05-14 18:28:17 -03:00
Pablo Hoffman
65aa9ccc70 Merge pull request #1220 from eliasdorneles/building-settingslist-from-docs
Building settings list from docs
2015-05-13 14:43:19 -03:00
Elias Dorneles
5753e498bf fixes referencing, and list only settings not documented in current document 2015-05-09 16:15:06 -03:00
Julia Medina
c271d8f0b1 Title underline too short in docs/topics/selectors.rst 2015-05-09 05:20:54 -03:00
Julia Medina
6fd7d85448 Wrong bullet list indentation in docs/topics/media-pipeline.rst 2015-05-09 05:19:15 -03:00
Julia Medina
acc13c9821 Delete tab used as indentation in docs/topics/loaders.rst 2015-05-09 05:15:17 -03:00
Julia Medina
819a8eceee Mark as orphan the doc topics not listed in the index 2015-05-09 05:12:35 -03:00
Julia Medina
d3f576a816 Move scrapy/spider.py to scrapy/spiders/__init__.py 2015-05-09 04:20:09 -03:00
Julia Medina
d72536688f Move scrapy/linkextractor.py to scrapy/linkextractors/__init__.py 2015-05-09 03:28:37 -03:00
bosnj
8ae05478be added docs and test case, fixed handling empty string vs None 2015-05-04 21:22:17 +02:00