Merge pull request #3053 from scrapy/release-notes-1.5

Release notes for the upcoming 1.5.0 version
2025-02-24 12:44:06 +00:00 · 2017-12-29 11:57:04 -03:00 · 2017-12-29 11:57:04 -03:00 · 9b4d6a40a6
commit 9b4d6a40a6
parent a1cc5a63d3 c107059ef8
1 changed files with 112 additions and 0 deletions
--- a/docs/news.rst
+++ b/docs/news.rst
@@ -3,6 +3,118 @@
 Release notes
 =============

+Scrapy 1.5.0 (2017-XX-XX)
+-------------------------
+
+This release brings small new features and improvements across the codebase.
+Some highlights:
+
+* Google Cloud Storage is supported in FilesPipeline and ImagesPipeline.
+* Crawling with proxy servers becomes more efficient, as connections
+  to proxies can be reused now.
+* Warning, exception and logging messages are improved to make debugging
+  easier.
+* The ``scrapy parse`` command now allows setting custom request meta via
+  the ``--meta`` argument.
+* Compatibility with Python 3.6, PyPy and PyPy3 is improved;
+  PyPy and PyPy3 are now supported officially, by running tests on CI.
+* Better default handling of HTTP 308, 522 and 524 status codes.
+* Documentation is improved, as usual.
+
+Backwards Incompatible Changes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* Scrapy 1.5 drops support for Python 3.3.
+* Default Scrapy User-Agent now uses an https link to scrapy.org (:issue:`2983`).
+  **This is technically backwards-incompatible**; override
+  :setting:`USER_AGENT` if you relied on the old value.
+* Logging of settings overridden by ``custom_settings`` is fixed;
+  **this is technically backwards-incompatible** because the logger
+  changes from ``[scrapy.utils.log]`` to ``[scrapy.crawler]``. If you're
+  parsing Scrapy logs, please update your log parsers (:issue:`1343`).
+* LinkExtractor now ignores the ``m4v`` extension by default; this is a change
+  in behavior.
+* 522 and 524 status codes are added to ``RETRY_HTTP_CODES`` (:issue:`2851`)
+
+New features
+~~~~~~~~~~~~
+
+- Support ``<link>`` tags in ``Response.follow`` (:issue:`2785`)
+- Support for ``ptpython`` REPL (:issue:`2654`)
+- Google Cloud Storage support for FilesPipeline and ImagesPipeline
+  (:issue:`2923`).
+- New ``--meta`` option of the ``scrapy parse`` command allows passing
+  additional request.meta (:issue:`2883`)
+- Populate spider variable when using ``shell.inspect_response`` (:issue:`2812`)
+- Handle HTTP 308 Permanent Redirect (:issue:`2844`)
+- Add 522 and 524 to ``RETRY_HTTP_CODES`` (:issue:`2851`)
+- Log version information at startup (:issue:`2857`)
+- ``scrapy.mail.MailSender`` now works in Python 3 (it requires Twisted 17.9.0)
+- Connections to proxy servers are reused (:issue:`2743`)
+- Add template for a downloader middleware (:issue:`2755`)
+- Explicit message for NotImplementedError when the parse callback is not
+  defined (:issue:`2831`)
+- CrawlerProcess got an option to disable installation of the root log handler
+  (:issue:`2921`)
+- LinkExtractor now ignores the ``m4v`` extension by default
+- Better log messages for responses over :setting:`DOWNLOAD_WARNSIZE` and
+  :setting:`DOWNLOAD_MAXSIZE` limits (:issue:`2927`)
+- Show a warning when a URL is put in ``Spider.allowed_domains`` instead of
+  a domain (:issue:`2250`).
+
+Bug fixes
+~~~~~~~~~
+
+- Fix logging of settings overridden by ``custom_settings``;
+  **this is technically backwards-incompatible** because the logger
+  changes from ``[scrapy.utils.log]`` to ``[scrapy.crawler]``, so please
+  update your log parsers if needed (:issue:`1343`)
+- Default Scrapy User-Agent now uses an https link to scrapy.org (:issue:`2983`).
+  **This is technically backwards-incompatible**; override
+  :setting:`USER_AGENT` if you relied on the old value.
+- Fix PyPy and PyPy3 test failures, support them officially
+  (:issue:`2793`, :issue:`2935`, :issue:`2990`, :issue:`3050`, :issue:`2213`,
+  :issue:`3048`)
+- Fix DNS resolver when ``DNSCACHE_ENABLED=False`` (:issue:`2811`)
+- Add ``cryptography`` for Debian Jessie tox test env (:issue:`2848`)
+- Add verification to check if Request callback is callable (:issue:`2766`)
+- Port ``extras/qpsclient.py`` to Python 3 (:issue:`2849`)
+- Use getfullargspec under the hood for Python 3 to stop DeprecationWarning
+  (:issue:`2862`)
+- Update deprecated test aliases (:issue:`2876`)
+- Fix ``SitemapSpider`` support for alternate links (:issue:`2853`)
+
+Docs
+~~~~
+
+- Added missing bullet point for the ``AUTOTHROTTLE_TARGET_CONCURRENCY``
+  setting. (:issue:`2756`)
+- Update Contributing docs, document new support channels
+  (:issue:`2762`, :issue:`3038`)
+- Include references to Scrapy subreddit in the docs
+- Fix broken links; use https:// for external links
+  (:issue:`2978`, :issue:`2982`, :issue:`2958`)
+- Document CloseSpider extension better (:issue:`2759`)
+- Use ``pymongo.collection.Collection.insert_one()`` in MongoDB example
+  (:issue:`2781`)
+- Fix spelling mistakes and typos
+  (:issue:`2828`, :issue:`2837`, :issue:`2884`, :issue:`2924`)
+- Clarify ``CSVFeedSpider.headers`` documentation (:issue:`2826`)
+- Document ``DontCloseSpider`` exception and clarify ``spider_idle``
+  (:issue:`2791`)
+- Update "Releases" section in README (:issue:`2764`)
+- Fix rst syntax in ``DOWNLOAD_FAIL_ON_DATALOSS`` docs (:issue:`2763`)
+- Small fix in description of startproject arguments (:issue:`2866`)
+- Clarify data types in Response.body docs (:issue:`2922`)
+- Add a note about ``request.meta['depth']`` to DepthMiddleware docs (:issue:`2374`)
+- Add a note about ``request.meta['dont_merge_cookies']`` to CookiesMiddleware
+  docs (:issue:`2999`)
+- Up-to-date example of project structure (:issue:`2964`, :issue:`2976`)
+- A better example of ItemExporters usage (:issue:`2989`)
+- Document ``from_crawler`` methods for spider and downloader middlewares
+  (:issue:`3019`)
+
+
 Scrapy 1.4.0 (2017-05-18)
 -------------------------