mirror of https://github.com/scrapy/scrapy.git synced 2025-02-23 08:24:16 +00:00

7619 Commits

Author SHA1 Message Date
Marc Hernandez Cabot
a5de2c64e6 fix W291, W292, W293 (whitespaces) 2019-12-18 16:33:46 +01:00
elacuesta
916382e109 Add errback parameter to scrapy.spiders.crawl.Rule (#4000)
* Add errback parameter to scrapy.spiders.crawl.Rule

* CrawlSpider: optimize by reducing iterations

* [test] Rule.errback

* [doc] Rule.errback

* [doc] Use autoclass in docs/topics/spiders.rst

Co-Authored-By: Adrián Chaves <adrian@chaves.io>

* Rule.process_links takes a list

* Fix aesthetic issue reported by Flake8
2019-12-18 20:05:33 +05:00
Andrey Rahmatullin
0e8ee22a88
Merge pull request #4244 from wRAR/split-EngineTest
Split a long test in test_engine.py into three.
2019-12-18 20:00:36 +05:00
Andrey Rakhmatullin
7ccb169a27 Split a long test in test_engine.py into three. 2019-12-18 19:41:16 +05:00
Mikhail Korobov
053319334c
Merge pull request #4179 from Gallaecio/user-friendlier-tox
Make tox configuration more user friendly
2019-12-18 18:35:19 +05:00
Andrey Rahmatullin
e2b5cdeb88
Merge pull request #4242 from whalebot-helmsman/single_place_for_dependencies
Remove requirements-py3.txt
2019-12-18 18:27:09 +05:00
Vostretsov Nikita
012533924a remove requirements from here too 2019-12-18 11:13:36 +00:00
Mikhail Korobov
e90f276903
Merge pull request #4198 from wRAR/deprecate-noconnect
Deprecate the HTTPS proxy noconnect mode.
2019-12-18 16:11:24 +05:00
Vostretsov Nikita
12f9ffeb5d remove requirements-py3.txt 2019-12-18 10:53:27 +00:00
Andrey Rakhmatullin
ac302c3f61 Fix a flake8 problem. 2019-12-18 15:43:05 +05:00
Andrey Rakhmatullin
bb2ff13e4c Skip Http10ProxyTestCase.test_download_with_proxy_https_noconnect 2019-12-18 15:39:08 +05:00
Andrey Rakhmatullin
2d92a39003 Restore test_download_with_proxy_https_noconnect, check for a warning there. 2019-12-18 12:07:08 +05:00
Adrián Chaves
b8cf522916
Merge branch 'master' into user-friendlier-tox 2019-12-17 16:53:08 +01:00
Andrey Rahmatullin
4be2bbfe75
Merge pull request #4239 from Apuyuseng/master
Fix mail attachs tcmime *** (#4229)
2019-12-17 19:43:30 +05:00
Andrey Rahmatullin
bb3f164280
Merge pull request #4236 from wRAR/pipeline-tests
Add simple tests for pipelines.
2019-12-17 19:38:06 +05:00
Marc Hernández
63cf5c75c8 Fix E502: backslash is redundant between brackets (#4238) 2019-12-17 13:53:15 +01:00
apu
7d0096da6e
Fix mail attachs tcmime *** (#4229)
When the file name consists of alphanumeric characters, the attachment name is received normally.
However, problems occur if the file name is changed to Chinese.
This has nothing to do with the file type.
2019-12-17 09:47:01 +08:00
Andrey Rakhmatullin
5980d3bbff Add simple tests for pipelines. 2019-12-16 22:41:28 +05:00
Mikhail Korobov
e9b24d62c6
Merge pull request #4231 from noviluni/docs_fix
Docs fix
2019-12-16 21:56:43 +05:00
marc
a59bb279d1 add year through code 2019-12-15 17:33:00 +01:00
marc
1aab20e1ce update copyright notice year 2019-12-14 10:37:31 +01:00
marc
e3a3ad4aaf remove reference to old (Python 2.7) environment 2019-12-14 10:34:31 +01:00
Adrián Chaves
b5c4c2cae8
Keep 2 spaces between code and inline comments (#4195) 2019-12-13 14:20:48 +01:00
Andrey Rahmatullin
02cdc53fb8
Add a test for a CrawlerProcess script. (#4218)
* Add a test for a CrawlerProcess script.

* Add tests/CrawlerProcess to collect_ignore.

* Remove an extra line.

* Fix/improve conftest.py.
2019-12-13 18:04:05 +05:00
Adrián Chaves
07b8cd28aa
Mark bandit’s 402 check as addressed by #4180 (#4181) 2019-12-05 14:48:31 +01:00
Andrey Rahmatullin
076f0764b7
Merge pull request #4121 from scrapy/remove-six-code
Remove six-related code and __future__ imports
2019-12-05 18:20:14 +05:00
Adrián Chaves
d7b1c138f1
Merge branch 'master' into user-friendlier-tox 2019-12-05 14:02:24 +01:00
Mikhail Korobov
250da28952
Merge pull request #4170 from mabelvj/4133-handle-start_url
Raise error when start_url found instead of start_urls.
2019-12-05 17:47:03 +05:00
Andrey Rahmatullin
7079d12c5e
Merge pull request #4212 from grammy-jiang/fix-imports
Convert the relative imports to absolute imports
2019-12-05 12:44:22 +05:00
Andrey Rahmatullin
aaf94affee
Merge pull request #4213 from dqwerter/patch-1
Update overview.rst | Fix an inconsistency
2019-12-05 12:43:24 +05:00
Wang Qin
af624ef414
Update overview.rst | Fix an inconsistency
There is an inconsistency between the code (lines 37-38) and the output 'quotes.json' (lines 56-68).

Note that even though, according to lines 53-54, 'quotes.json' is "reformatted here for better readability", that cannot explain why the "author" field precedes the "text" field.

Intended output for the code BEFORE the change:
    [{
        "text": "\u201cThe person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.\u201d",
        "author": "Jane Austen"
    },
    {
        "text": "\u201cOutside of a dog, a book is man's best friend. Inside of a dog it's too dark to read.\u201d",
        "author": "Groucho Marx"
    },
    {
        "text": "\u201cA day without sunshine is like, you know, night.\u201d",
        "author": "Steve Martin"
    },
    ...]

Intended output for the code AFTER the change (the inconsistency is fixed):
    [{
        "author": "Jane Austen",
        "text": "\u201cThe person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.\u201d"
    },
    {
        "author": "Groucho Marx",
        "text": "\u201cOutside of a dog, a book is man's best friend. Inside of a dog it's too dark to read.\u201d"
    },
    {
        "author": "Steve Martin",
        "text": "\u201cA day without sunshine is like, you know, night.\u201d"
    },
    ...]
2019-12-05 09:29:12 +08:00
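
The ordering argument in the commit message above can be illustrated with the Python standard library alone. This is a minimal sketch, not Scrapy's own exporter code: the `item` dict is a hypothetical stand-in for what a spider callback might yield, and the key order is an assumption chosen to match the "after" output. It shows that, on Python 3.7+, `json.dumps` emits keys in the order they were inserted, so the serialized output follows the order used in the code.

```python
import json

# Hypothetical item, standing in for a dict yielded by a spider callback.
# The key order ("author" first, then "text") is an assumption for illustration.
item = {
    "author": "Jane Austen",
    "text": "\u201cThe person, be it gentleman or lady, who has not pleasure "
            "in a good novel, must be intolerably stupid.\u201d",
}

# json.dumps keeps dict insertion order unless sort_keys=True is passed.
serialized = json.dumps(item)

# Parsing the result back also preserves the order found in the JSON text.
keys_in_output = list(json.loads(serialized))
print(keys_in_output)
```

Swapping the two keys in `item` would swap them in the serialized output as well, which is why the documented output has to match the order used in the example code.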
Grammy Jiang
9b4b43f8ac
Convert the relative imports to absolute imports
This commit converts the relative imports to absolute imports in the
entire package
2019-12-05 11:25:19 +11:00
Eugenio Lacuesta
2c010152c3
Merge remote-tracking branch 'upstream/master' into remove-six-code 2019-12-04 15:43:02 -03:00
Grammy Jiang
74627033c4 Remove the unused import and re-arrange the imports (#4208)
This commit removes an unused import and re-arranges the imports in the
cookies module
2019-12-04 14:24:14 +01:00
Grammy Jiang
702333478d Re-arrange the imports in httpcache module (#4209)
This commit re-arranges the imports in the httpcache module to follow PEP 8
2019-12-04 14:23:28 +01:00
Grammy Jiang
5d8d4bb7d7 Re-arrange the imports in the httpproxy module (#4210)
This commit re-arranges the imports in the httpproxy module to follow
pep8
2019-12-04 14:22:10 +01:00
Mikhail Korobov
9b7452211a
Merge pull request #4099 from BurnzZ/itemloader-docs
update docs of scrapy.loader.ItemLoader.item
2019-12-03 13:14:45 +05:00
Grammy Jiang
d1cdfb4701 Use pprint.pformat on overridden settings (#4199)
Keeps consistency with scrapy.middleware
2019-11-29 09:13:57 +01:00
Mikhail Korobov
3a5b86220f
Merge pull request #4194 from Gallaecio/intersphinx
Use InterSphinx for coverage links
2019-11-28 17:36:19 +05:00
Andrey Rakhmatullin
63546cbf3e Deprecate the HTTPS proxy noconnect mode. 2019-11-27 22:42:52 +05:00
Adrián Chaves
b73fc99b60 Use InterSphinx for coverage links 2019-11-26 10:31:55 +01:00
Adrián Chaves
6d9ed6146d
Merge branch 'master' into remove-six-code 2019-11-25 10:34:21 +01:00
Andrey Rahmatullin
8a1c99676e
Merge pull request #3899 from elacuesta/py3_single_argument_processors
[Py3] Item loaders: allow single argument functions as processors
2019-11-25 13:47:58 +05:00
Eugenio Lacuesta
40b5cfc0a4
Item loaders: allow single-argument processors (unbound methods) 2019-11-22 20:47:22 -03:00
Eugenio Lacuesta
55cc5c9068
Skip pickle in bandit check 2019-11-22 12:41:31 -03:00
Eugenio Lacuesta
5bab3c0261
Merge remote-tracking branch 'upstream/master' into remove-six-code 2019-11-22 12:12:29 -03:00
Mikhail Korobov
16e0636dcf
Merge pull request #4186 from Gallaecio/lgtm
Remove unused imports
2019-11-22 12:28:29 +05:00
Adrián Chaves
9b5053c564 Undo unintended tox.ini changes 2019-11-21 22:00:34 +01:00
Mikhail Korobov
0d416c6191
Merge pull request #4185 from Gallaecio/intersphinx
Use InterSphinx for links to the pytest and tox documentation
2019-11-21 23:27:57 +05:00
Mabel Villalba
070b3a4e84
Merge branch 'master' into 4133-handle-start_url 2019-11-21 17:10:31 +01:00