.. Mirror of https://github.com/scrapy/scrapy.git, synced 2025-02-24 19:03:54 +00:00

.. _topics-logging:

=======
Logging
=======

.. note::
    :mod:`scrapy.log` has been deprecated, along with its functions, in favor
    of explicit calls to the Python standard logging. Keep reading to learn
    more about the new logging system.

Scrapy uses `Python's built-in logging system
<https://docs.python.org/2/library/logging.html>`_ for event logging. We'll
provide some simple examples to get you started, but for more advanced
use cases we strongly suggest reading its documentation thoroughly.

Logging works out of the box, and can be configured to some extent with the
Scrapy settings listed in :ref:`topics-logging-settings`.

When running commands, Scrapy calls :func:`scrapy.utils.log.configure_logging`
to set some reasonable defaults and to handle the settings in
:ref:`topics-logging-settings`. If you're running Scrapy from scripts, as
described in :ref:`run-from-script`, it's recommended that you call it
manually.

.. _topics-logging-levels:

Log levels
==========

Python's built-in logging defines 5 different levels to indicate the severity
of a given log message. Here are the standard ones, listed in decreasing order
of severity:

1. ``logging.CRITICAL`` - for critical errors (highest severity)
2. ``logging.ERROR`` - for regular errors
3. ``logging.WARNING`` - for warning messages
4. ``logging.INFO`` - for informational messages
5. ``logging.DEBUG`` - for debugging messages (lowest severity)

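As a quick illustration of the ordering above, each level is backed by an
integer constant in the standard library, and a higher number means a higher
severity. The sketch below uses only the standard ``logging`` module; the
``levels`` list and the ``levels_demo`` logger name are made up for this
example.

```python
import logging

# The five standard levels, listed from highest to lowest severity.
levels = [
    logging.CRITICAL,
    logging.ERROR,
    logging.WARNING,
    logging.INFO,
    logging.DEBUG,
]

# Each level is a plain integer, so levels can be compared directly.
for level in levels:
    print(logging.getLevelName(level), level)

# A logger set to WARNING handles WARNING and above, and skips INFO and DEBUG.
logger = logging.getLogger("levels_demo")  # hypothetical name
logger.setLevel(logging.WARNING)
enabled = [logging.getLevelName(lv) for lv in levels if logger.isEnabledFor(lv)]
```
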
How to log messages
===================

Here's a quick example of how to log a message using the ``logging.WARNING``
level::

    import logging
    logging.warning("This is a warning")

There are shortcuts for issuing log messages on any of the standard 5 levels,
and there's also a general ``logging.log`` method which takes a given level as
an argument. If needed, the last example could be rewritten as::

    import logging
    logging.log(logging.WARNING, "This is a warning")

On top of that, you can create different "loggers" to encapsulate messages (for
example, a common practice is to create a different logger for every module).
These loggers can be configured independently, and they allow hierarchical
constructions.

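To make the hierarchy concrete, here is a minimal sketch using only the
standard ``logging`` module. The logger names ``myproject`` and
``myproject.spiders`` are hypothetical; dots in a logger name are what define
the parent/child relationship.

```python
import logging

# Hypothetical logger names, chosen only for this example.
parent = logging.getLogger("myproject")
child = logging.getLogger("myproject.spiders")

# Configure only the parent. The child has no level of its own (NOTSET),
# so it inherits the effective level of its nearest configured ancestor.
parent.setLevel(logging.ERROR)
inherited = child.getEffectiveLevel()

# Loggers can still be configured independently of their parent.
child.setLevel(logging.DEBUG)
overridden = child.getEffectiveLevel()

print(logging.getLevelName(inherited), logging.getLevelName(overridden))
```
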
The ``logging.warning`` and ``logging.log`` calls shown earlier use the root
logger behind the scenes, which is a top-level logger that all messages are
propagated to (unless otherwise specified). Using the ``logging`` helpers is
merely a shortcut for getting the root logger explicitly, so this is also
equivalent to those snippets::

    import logging
    logger = logging.getLogger()
    logger.warning("This is a warning")

You can use a different logger just by passing its name to the
``logging.getLogger`` function::

    import logging
    logger = logging.getLogger('mycustomlogger')
    logger.warning("This is a warning")

Finally, you can make sure each module you work on has its own logger by using
the ``__name__`` variable, which is populated with the current module's
path::

    import logging
    logger = logging.getLogger(__name__)
    logger.warning("This is a warning")

.. seealso::

    Module logging, `HowTo <https://docs.python.org/2/howto/logging.html>`_
        Basic Logging Tutorial

    Module logging, `Loggers <https://docs.python.org/2/library/logging.html#logger-objects>`_
        Further documentation on loggers

.. _topics-logging-from-spiders:

Logging from Spiders
====================

Scrapy provides a :data:`~scrapy.spider.Spider.logger` within each Spider
instance, which can be accessed and used like this::

    import scrapy

    class MySpider(scrapy.Spider):

        name = 'myspider'
        start_urls = ['http://scrapinghub.com']

        def parse(self, response):
            self.logger.info('Parse function called on %s', response.url)

That logger is created using the Spider's name, but you can use any custom
Python logger you want. For example::

    import logging
    import scrapy

    logger = logging.getLogger('mycustomlogger')

    class MySpider(scrapy.Spider):

        name = 'myspider'
        start_urls = ['http://scrapinghub.com']

        def parse(self, response):
            logger.info('Parse function called on %s', response.url)

.. _topics-logging-configuration:

Logging configuration
=====================

Loggers on their own don't manage how messages sent through them are displayed.
For this task, different "handlers" can be attached to any logger instance and
they will redirect those messages to appropriate destinations, such as the
standard output, files, emails, etc.

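The following sketch shows the idea with the standard library alone: a handler
is attached to a logger and decides where the records end up. An in-memory
buffer stands in for a real destination such as a file or the console, and the
``handler_demo`` logger name is hypothetical.

```python
import io
import logging

# Hypothetical logger name, used only for this example.
logger = logging.getLogger("handler_demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep this example's records away from the root logger

# A handler decides where records go; here an in-memory text buffer stands in
# for a destination such as a file or the console.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
logger.addHandler(handler)

logger.info("Spider opened")
logger.debug("This is below INFO, so the logger drops it")

output = buffer.getvalue()
print(output, end="")
```

Swapping ``logging.StreamHandler`` for ``logging.FileHandler`` would send the
same records to a file without touching the code that emits them.
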
By default, Scrapy sets and configures a handler for the root logger, based on
the settings below.

.. _topics-logging-settings:

Logging settings
----------------

These settings can be used to configure the logging:

* :setting:`LOG_FILE`
* :setting:`LOG_ENABLED`
* :setting:`LOG_ENCODING`
* :setting:`LOG_LEVEL`
* :setting:`LOG_FORMAT`
* :setting:`LOG_DATEFORMAT`
* :setting:`LOG_STDOUT`

The first few settings define a destination for log messages. If
:setting:`LOG_FILE` is set, messages sent through the root logger will be
redirected to a file named :setting:`LOG_FILE` with encoding
:setting:`LOG_ENCODING`. If it is unset and :setting:`LOG_ENABLED` is ``True``,
log messages will be displayed on the standard error. Lastly, if
:setting:`LOG_ENABLED` is ``False``, there won't be any visible log output.

:setting:`LOG_LEVEL` determines the minimum level of severity to display;
messages with a lower severity will be filtered out. It accepts any of the
levels listed in :ref:`topics-logging-levels`.

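As a rough sketch, a project's settings module could combine the destination
and level settings as shown below. The values are illustrative assumptions
(including the file name ``myspider.log``), not Scrapy's defaults.

```python
# settings.py (illustrative values, not Scrapy's defaults)

# Keep logging turned on.
LOG_ENABLED = True

# Send log messages to this file instead of the standard error.
# The file name is a made-up example.
LOG_FILE = "myspider.log"
LOG_ENCODING = "utf-8"

# Only show messages of INFO severity or higher; DEBUG messages are dropped.
LOG_LEVEL = "INFO"
```
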
:setting:`LOG_FORMAT` and :setting:`LOG_DATEFORMAT` specify formatting strings
used as layouts for all messages. Those strings can contain any placeholders
listed in `logging's logrecord attributes docs
<https://docs.python.org/2/library/logging.html#logrecord-attributes>`_ and
`datetime's strftime and strptime directives
<https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior>`_
respectively.

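To see what such strings produce, the sketch below feeds a message format and a
date format to the standard library's ``logging.Formatter``. The two strings
are illustrative values of the kind you could assign to :setting:`LOG_FORMAT`
and :setting:`LOG_DATEFORMAT`; the record is built by hand, and names such as
``myspider`` and ``example.py`` are placeholders.

```python
import logging

# Illustrative values; any LogRecord attribute or strftime directive works.
log_format = "%(asctime)s [%(name)s] %(levelname)s: %(message)s"
log_dateformat = "%Y-%m-%d %H:%M:%S"

formatter = logging.Formatter(fmt=log_format, datefmt=log_dateformat)

# Build a record by hand so the example does not depend on any handler setup.
record = logging.LogRecord(
    name="myspider",        # placeholder logger name
    level=logging.INFO,
    pathname="example.py",  # placeholder source file
    lineno=1,
    msg="Crawled %d pages",
    args=(3,),
    exc_info=None,
)

# The result starts with a timestamp laid out by log_dateformat, followed by
# the logger name, the level name and the interpolated message.
line = formatter.format(record)
print(line)
```
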
Command-line options
--------------------

There are command-line arguments, available for all commands, that you can use
to override some of the Scrapy settings regarding logging.

* ``--logfile FILE``
    Overrides :setting:`LOG_FILE`
* ``--loglevel/-L LEVEL``
    Overrides :setting:`LOG_LEVEL`
* ``--nolog``
    Sets :setting:`LOG_ENABLED` to ``False``

.. seealso::

    Module `logging.handlers <https://docs.python.org/2/library/logging.handlers.html>`_
        Further documentation on available handlers

scrapy.utils.log module
=======================

.. module:: scrapy.utils.log
   :synopsis: Logging utils

.. function:: configure_logging(settings=None)

    This function initializes logging defaults for Scrapy.

    It's called automatically when using Scrapy commands, but custom scripts
    have to call it explicitly to get these defaults. Calling it from a
    script is not required, but it is recommended.

    This function does the following:

    - Routes warnings and Twisted logging through Python standard logging
    - Sets a filter on the Scrapy logger for formatting Twisted failures
    - Assigns DEBUG and ERROR levels to the Scrapy and Twisted loggers
      respectively

    If `settings` is not ``None``, it will also create a root handler based on
    the settings listed in :ref:`topics-logging-settings`.

    If you plan on configuring the handlers yourself, it is still recommended
    that you call this function, keeping `settings` as ``None``. Bear in mind
    there won't be any log output set by default in that case.

    To get you started on manually configuring logging's output, you can use
    `logging.basicConfig()`_ to set a basic root handler. This is an example of
    how to redirect ``INFO`` or higher messages to a file::

        import logging
        from scrapy.utils.log import configure_logging

        configure_logging()  # Note we aren't providing settings in this case
        logging.basicConfig(
            filename='log.txt',
            format='%(levelname)s: %(message)s',
            level=logging.INFO
        )

    Refer to :ref:`run-from-script` for more details about using Scrapy this
    way.

    :param settings: settings used to create and configure a handler for the
        root logger.
    :type settings: :class:`~scrapy.settings.Settings` object or ``None``

.. _logging.basicConfig(): https://docs.python.org/2/library/logging.html#logging.basicConfig