mirror of https://github.com/scrapy/scrapy.git synced 2025-02-23 18:23:51 +00:00

7297 Commits

Author SHA1 Message Date
Eugenio Lacuesta
97a7d775f7
Aplly suggestions by @victor-torres 2019-08-29 10:51:16 -03:00
Eugenio Lacuesta
b6b76df057
CallbackKeywordArgumentsContract 2019-08-29 03:29:24 -03:00
Mikhail Korobov
ede91478e5
Merge pull request #3966 from anubhavp28/robotstxt_useragent
Adds ROBOTSTXT_USER_AGENT setting
2019-08-28 22:00:17 +05:00
Mikhail Korobov
93d4b0b0d7
Merge pull request #3973 from Gallaecio/documentation-coverage
Provide complete API documentation coverage of scrapy.exporters
2019-08-28 21:59:10 +05:00
Mikhail Korobov
b00f81c52b
Merge pull request #3978 from anubhavp28/doc-link-fix
Fixes a link in docs
2019-08-28 21:58:25 +05:00
Anubhav Patel
77c8ab2e62 makes suggested changes 2019-08-27 18:44:08 +05:30
Anubhav Patel
ad824a264b fixes a link in doc 2019-08-27 18:30:11 +05:30
Anubhav Patel
3a7b949d6d Adds integration with Protego robots.txt parser (#3935) 2019-08-27 09:41:31 +02:00
elacuesta
3abe7e6e6d Add Bug report and Feature request templates (#3471) 2019-08-26 09:35:44 +02:00
Adrián Chaves
0fa384e80d Provide complete API documentation coverage of scrapy.exporters 2019-08-22 20:10:42 +02:00
Mikhail Korobov
56948c446c
Merge pull request #3442 from wRAR/ciphers
Support for overriding OpenSSL ciphers
2019-08-19 23:15:09 +05:00
Anubhav Patel
00fe05e536 adds ROBOTSTXT_USER_AGENT setting 2019-08-19 09:24:16 +05:30
Andrey Rakhmatullin
aaa5229e5d Fixes and improvements for DOWNLOADER_CLIENT_TLS_CIPHERS. 2019-08-13 16:56:26 +05:00
Andrey Rakhmatullin
9a8edf2bf1 Tests for setting SSL ciphers. 2019-08-13 16:53:19 +05:00
Andrey Rakhmatullin
ce281d890d Documentation for DOWNLOADER_CLIENT_TLS_CIPHERS. 2019-08-13 16:53:19 +05:00
Andrey Rakhmatullin
3384db92b4 Add support for setting SSL ciphers. 2019-08-13 16:53:19 +05:00
Mikhail Korobov
a95de71d8e
Merge pull request #3950 from elacuesta/version_updates
Remove obsolete version checks
2019-08-12 22:51:00 +05:00
Mikhail Korobov
2f0c46e762
Merge pull request #3946 from elacuesta/simplify_versions
[MRG+1] Simplify version reporting
2019-08-10 14:25:59 +05:00
Eugenio Lacuesta
26fb28b20f
PEP8-ify HTTP/1.1 downloader handler
Signed-off-by: Eugenio Lacuesta <eugenio.lacuesta@gmail.com>
2019-08-09 00:56:24 -03:00
Eugenio Lacuesta
d5dcc5eaef
Import twisted.web.client.URI directly 2019-08-09 00:30:58 -03:00
Eugenio Lacuesta
d3737d869b
Remove check for Twisted>=14.0 2019-08-09 00:21:43 -03:00
Eugenio Lacuesta
e17c9a48fd
Remove check for Twisted>=15.0.0
16.0.0 is currently the minimum supported version
2019-08-08 23:59:17 -03:00
Eugenio Lacuesta
d92f1b1858
Simplify import + assignment 2019-08-08 23:53:35 -03:00
Eugenio Lacuesta
b404941e0d
Remove import check for service_identity
service_identity.exceptions.CertificateError is available in the current minimum version (16.0.0)
2019-08-08 23:18:47 -03:00
Eugenio Lacuesta
3164543ed1
Remove fallback ScrapyClientContextFactory class (used in Twisted < 14.0.0)
16.0.0 is currently the minimum supported version
2019-08-08 23:15:03 -03:00
Eugenio Lacuesta
a940a80f58
Remove check for pyOpenSSL>=0.16
16.2.0 is currently the minimum supported version
2019-08-08 23:05:52 -03:00
Eugenio Lacuesta
fa9a9033f0
Remove check for Twisted>=14.0.0
16.0.0 is currently the minimum supported version
2019-08-08 22:58:59 -03:00
Eugenio Lacuesta
da385b56b1
Move get_openssl_version function to scrapy.utils.ssl 2019-08-08 12:44:23 -03:00
Shivam Sandbhor
3040f77468 [MRG+1] Update project.py removed one 'hack', seems irrelevant. (#3910)
* Update project.py removed one 'hack', seems irrelevant.

As mentioned by @Gallaecio in issue #3871, the 'hack' is cleared. I also double checked whether the environment variable "SCRAPY_PICKLED_SETTINGS_TO_OVERRIDE" was ever set in our codebase and it turns out we didn't set it or used it anywhere else. So I guess the 'hack' was not used in the current version. Also the name of this environment variable rather doesn't suggest it was a boolean (it is used in an 'if' condition which has perplexed me)

* Update project.py

* Update project.py

How about this?

* Update project.py

* Update project.py

* Update scrapy/utils/project.py

Co-Authored-By: Adrián Chaves <adrian@chaves.io>

* Update scrapy/utils/project.py

Co-Authored-By: Adrián Chaves <adrian@chaves.io>

* Update project.py
2019-08-08 16:58:22 +05:00
Mikhail Korobov
73d1b4b748
Merge pull request #3939 from Gallaecio/improve-scrapes-contract
Report all missing fields when a scrapes contract fails
2019-08-08 13:49:09 +05:00
Mikhail Korobov
b4556d6508
Merge pull request #3941 from starrify/ftp-storage-uri-unquote
[MRG+1] Added: Properly handling quoted passwords in FEED_URI for FTP
2019-08-08 13:38:17 +05:00
Adrián Chaves
9119798a5c Add test coverage for contract failures involving multiple missing fields 2019-08-08 09:52:03 +02:00
Marc Hernández
d76b6944c9 Create Request from curl command (#3862) 2019-08-08 09:43:42 +02:00
Eugenio Lacuesta
595c995ee6
Simplify version reporting 2019-08-07 15:38:04 -03:00
elacuesta
5dbeece8da [MRG+1] Drop py34 support - Update CI envs (#3892)
* Drop py34 support

* Travis experiments

* More Travis experiments

* Bump Twisted version for py35+ (stretch)

* Remove Debian build

* Remove pinned lxml for Py34

* Fix merge error

* Remove unused tox env

* Add environment with pinned versions for py36

* Bump minimum Twisted version in py27; Envs with pinned versions for py27 and py35

* Add botocore as extra dep for py27 tests

* Update requirements-py2.txt

* Add botocore and Pillow as extra dependencies
2019-08-07 12:36:52 +05:00
Pengyu Chen
7b755a41a1
Added: Properly handling quoted passwords in FEED_URI for FTP 2019-08-06 15:19:57 +01:00
Adrián Chaves
bff335cf7f Improve the error message in contract failures due to multiple missing fields 2019-08-05 15:47:58 +02:00
tpeng
a8621bbc29 show all the missing field when scrapes contract fails 2019-08-05 15:27:31 +02:00
Daniel Graña
eef1732374
Merge pull request #3934 from anubhavp28/typo-fix
[MRG+1] Fixes typo
2019-08-05 09:39:21 -03:00
Anubhav Patel
9a4cd94244 fixes typo 2019-08-03 22:46:06 +05:30
Anubhav Patel
8e813953bd [MRG+1] [GSoC 2019] Interface for robots.txt parsers (#3796)
Make the robots.txt parser configurable through the new ROBOTSTXT_PARSER setting, support the Reppy and Robotexclusionrulesparser parsers, and allow implementing custom robots.txt parsers.
2019-08-02 09:43:29 +02:00
Adrián Chaves
a12e8251e0 Cover Scrapy 1.7.3 in the release notes 2019-08-01 17:10:31 +02:00
sbs2001
783d61d32a [MRG+1] Update _monkeypatches.py (#3907)
* Update _monkeypatches.py

The workarounds are not required assuming the bugs regarding urlparse are absent in Python versions >2.7. We already exit the program if Python version<2.7 in the __init__.py (line 17). The monkeypatches are deployed after this check at line 27 in the __init__.py.

* Update _monkeypatches.py

Added the second workaround.

* Update _monkeypatches.py

* Update _monkeypatches.py
2019-08-01 13:41:27 +05:00
Mikhail Korobov
cdf7889ada
Merge pull request #3923 from scrapy/pin-build-environment
Pin Travis-ci build environment to previous default: Trusty
2019-07-31 20:07:44 +05:00
Renne Rocha
a25e09ecdd Added constrain on lxml version based on Python version 2019-07-30 23:24:41 -03:00
Daniel Graña
7333fc02aa Pin Travis-ci build environment to previous default: Trusty
Travis-ci changed the default build environment to Xenial as explained in https://blog.travis-ci.com/2019-04-15-xenial-default-build-environment
This causes builds meant for Debian Jessie to break as noted by @wRAR in https://github.com/scrapy/scrapy/issues/3917#issuecomment-516426389

This change pins the environment to known working ubuntu trusty distribution prior to dropping Jessie support and upgrade to Xenial as base.

Closes #1369
2019-07-30 23:16:11 -03:00
Daniel Graña
b01d012b5a
Merge pull request #3920 from wRAR/fix-tls-logging
Fix memory handling and error handling in utils.ssl.get_temp_key_info.
2019-07-30 22:15:26 -03:00
Andrey Rakhmatullin
f21dc24a26 Fix memory handling and error handling in utils.ssl.get_temp_key_info. 2019-07-30 18:16:12 +05:00
Adrián Chaves
06c093f43f
Merge pull request #3905 from lucywang000/0.001
s3 file store persist_file should accept all supported headers
2019-07-29 19:02:14 +02:00
Adrián Chaves
04bca6af7c
Merge pull request #3894 from KristobalJunta/fix_retry_docs
fix default RETRY_HTTP_CODES value in docs
2019-07-29 18:20:55 +02:00