mirror of https://github.com/scrapy/scrapy.git synced 2025-02-06 08:33:33 +00:00

10711 Commits

Author SHA1 Message Date
Andrey Rakhmatullin
02ad6bd1f6
Merge pull request #6656 from Laerte/master
docs: Remove AjaxCrawlMiddleware mention from built-in downloader middleware
2025-02-05 21:05:54 +04:00
Laerte Pereira
2eb3c75c69 Remove AjaxCrawlMiddleware mention from built-in downloader middleware 2025-02-05 13:16:51 -03:00
Andrey Rakhmatullin
9fd08a92a8
Merge pull request #6655 from Laerte/master
Remove deprecated signals
2025-02-05 16:49:56 +04:00
Laerte Pereira
9d35428770 Remove deprecated signals 2025-02-05 06:48:56 -03:00
Andrey Rakhmatullin
cc480680d7
Merge pull request #6650 from wRAR/cleanup-cache-tests
Remove a duplicate test.
2025-02-03 23:33:07 +04:00
Andrey Rakhmatullin
ba5df629a2
Refactor downloader tests (#6647)
* Make download handler test base classes abstract.

* Small cleanup.

* Don't run the full test suite for special HTTP cases.

* Don't run tests in imported base classes.

* Remove an obsolete service_identity check.

* Move FTP imports back to the top level.

* Simplify the H2DownloadHandler import.

* Forbid pytest 8.2.x.

* Revert "Simplify the H2DownloadHandler import."

This reverts commit ed187046ac53c395c7423c0f5e6fb2bc7c27838f.
2025-02-03 20:11:47 +05:00
Andrey Rakhmatullin
16e39661e9
Merge pull request #6651 from wRAR/deprecate-ajaxcrawl
Deprecate AjaxCrawlMiddleware and stop calling escape_ajax() by default
2025-02-03 13:57:46 +04:00
Andrey Rakhmatullin
76a8badd24 Add a deprecation notice to the AjaxCrawlMiddleware docs. 2025-02-03 14:55:10 +05:00
Andrey Rakhmatullin
4842bcbf1d Deprecate and disable escape_ajax(). 2025-02-03 14:08:05 +05:00
Andrey Rakhmatullin
393ff96e45 Deprecate AjaxCrawlMiddleware. 2025-02-03 14:08:05 +05:00
Mikhail Korobov
b4c2531021
Merge pull request #6648 from wRAR/mockserver-calls
Optimize mockserver calls
2025-02-03 14:06:35 +05:00
Mikhail Korobov
c727c6f201
Merge pull request #6646 from wRAR/engine-tests-cleanup
Refactor EngineTest tests.
2025-02-03 13:56:47 +05:00
Andrey Rakhmatullin
df688910e0 Remove a duplicate test. 2025-02-02 18:48:26 +05:00
Andrey Rakhmatullin
783b98deda Make mockserver instances per-class. 2025-02-02 14:10:09 +05:00
Andrey Rakhmatullin
1a0dfbd32e Reuse mockserver instances in test_feedexport.py. 2025-02-02 13:28:34 +05:00
Andrey Rakhmatullin
200d76afa9 Refactor EngineTest tests. 2025-02-01 16:07:55 +05:00
Andrey Rakhmatullin
340819eff0
Merge pull request #6635 from wRAR/webclient-cleanup
Remove scrapy.core.downloader.webclient._parse().
2025-01-28 23:08:06 +04:00
Andrey Rakhmatullin
0a80871c3a Remove scrapy.core.downloader.webclient._parse(). 2025-01-28 22:22:09 +05:00
Andrey Rakhmatullin
a8d9746f56
Merge pull request #6634 from wRAR/deprecate-http10
Deprecate HTTP/1.0 code.
2025-01-28 13:29:25 +04:00
Andrey Rakhmatullin
0d2d2892ba Silence the readBody warning. 2025-01-28 02:08:49 +05:00
Andrey Rakhmatullin
bc1aeeefc9 Deprecate overriding ScrapyClientContextFactory.getContext(). 2025-01-28 02:07:47 +05:00
Andrey Rakhmatullin
16b998f9ca Sort out webclient tests. 2025-01-28 01:35:04 +05:00
Andrey Rakhmatullin
d27c6b46b1 Deprecate HTTP/1.0 support. 2025-01-27 21:25:47 +05:00
Lidiane T
98a57e2418
Fix error when running scrapy bench (#6633) 2025-01-27 11:21:30 +01:00
Andrey Rakhmatullin
cec0aeca58
Bump ruff, switch from black to ruff-format (#6631) 2025-01-27 11:07:09 +01:00
anubhav
c03fb2abb8
fix: added feed_options as a keyword argument to GCSFeedStorage. (#6628) 2025-01-23 21:06:45 +05:00
Andrey Rakhmatullin
d4b152bbf6
Drop PyPy 3.9, add a pypy3-extra-deps CI job. (#6613) 2025-01-23 09:22:18 +01:00
Rotzbua
7e61ff3524
Upgrade Sphinx (#6624) 2025-01-22 18:09:42 +01:00
Andrey Rakhmatullin
499b6c66b4
Merge pull request #6626 from Laerte/master
fix: test_s3_export fails with boto3 >= 1.36.0
2025-01-22 15:14:33 +04:00
guillermo-bondonno
9bc0029d27
Allow updating pre-crawler settings from add-ons (#6568) 2025-01-22 12:07:44 +01:00
Laerte Pereira
14219b1fca fix: test_s3_export fails with boto3 >= 1.36.0 2025-01-22 07:16:22 -03:00
Rotzbua
e0c828b7f6
chore(docs): refactor config (#6623) 2025-01-20 12:18:30 +01:00
Andrey Rakhmatullin
8bc8f752e6
Merge pull request #6622 from Rotzbua/fix_docs_broken_link
fix(docs): pillow domain is shut down permanently
2025-01-19 20:33:00 +04:00
Rotzbua
ee4f527f47
fix(docs): pillow domain is shut down permanently
See https://github.com/python-pillow/Pillow/issues/8585
2025-01-19 14:58:02 +01:00
Andrey Rakhmatullin
782e286ccf
Merge pull request #6621 from Rotzbua/chore_docs_rtd_template
chore(docs): migrate to RTD template v3
2025-01-19 17:35:43 +04:00
Rotzbua
d7168577b8
chore(docs): migrate to RTD template v3
notable change: Drop support for all versions of Internet Explorer.
2025-01-19 13:50:53 +01:00
anubhav
ca345a3b73
Flexible severity of logging level when items are dropped (#6608) 2025-01-15 11:08:18 +01:00
anubhav
1c1e83895c
Fix _pop_command_name (#6606) 2025-01-14 16:40:24 +01:00
Adrián Chaves
98ba61256d
Deprecate BaseDupeFilter.log() and improve dupefilter docs (#4151)
* Remove BaseDupeFilter.log()

It is never called because request_seen() always returns False

* Document the interface of DUPEFILTER_CLASS classes

* Remove unnecessary BaseDupeFilter comments and add a short class description

* Improve the documentation related to the DUPEFILTER_CLASS setting

* Deprecate BaseDupeFilter.log

* Update the docs

* Fix the new code example

* Remove typing to keep the example short

Otherwise, it would have required yet another import line (from __future__ or typing).

---------

Co-authored-by: Andrey Rakhmatullin <wrar@wrar.name>
2025-01-14 19:36:56 +05:00
Ionut-Cezar Ciubotariu
402500b164
Change unknown cmd message when outside project (#3426)
* Change unknown cmd message when outside project

* Simplification.

* Move the import to the top level.

* Reword the message.

---------

Co-authored-by: Andrey Rakhmatullin <wrar@wrar.name>
2025-01-10 23:08:27 +05:00
Kevin Lloyd Bernal
1fc91bb462
new allow_offsite parameter in OffsiteMiddleware (#6151)
* new 'allow_offsite' parameter in OffsiteMiddleware

* document deprecated dont_filter flag in OffsiteMiddleware

* avoid deprecating dont_filter in OffsiteMiddleware

* Copy the code to the downloader mw.

* Add tests for allow_offsite in the downloader mw.

* Mark allow_offsite with reqmeta.

---------

Co-authored-by: Andrey Rakhmatullin <wrar@wrar.name>
2025-01-08 21:28:51 +05:00
Adrián Chaves
b6d69e3895
Merge pull request #6610 from wRAR/coverage-subprocess
Fix and speed up subprocess coverage tracking
2025-01-07 18:29:31 +01:00
Andrey Rakhmatullin
3154b08e90 Improve coverage speed on Python 3.12+. 2025-01-07 20:42:03 +05:00
Andrey Rakhmatullin
7dfbecd392 Fix tracking of coverage in subprocesses. 2025-01-07 20:41:48 +05:00
Andrey Rakhmatullin
59fcb9b93c
Improve internal refs to scrapy.Request and scrapy.Selector (#6526)
* Improve internal refs to scrapy.Selector.

* Improve internal refs to scrapy.Request.

* More scrapy.http fixes.

* Fix FormRequest refs.

* More fixes.

* Simplifications.

* Last fixes.

* Add the parsel intersphinx.
2025-01-07 16:18:18 +05:00
Andrey Rakhmatullin
5d3aa80ad1
Switch CI to codecov/codecov-action and enable it on Windows. (#6609) 2025-01-07 10:52:26 +01:00
Andrey Rakhmatullin
4869315d10
Install libjpeg-dev on pinned envs to be able to install Pillow. (#6607) 2025-01-07 10:46:12 +01:00
Laerte Pereira
f2234c5b96
Fix Crawler.request_fingerprinter typing (#6605) 2025-01-07 10:40:49 +01:00
Andrey Rakhmatullin
4d31277bc6
Explicitly mark re-exports. (#6579) 2025-01-02 23:48:14 +01:00
Andrey Rakhmatullin
c330a399dc
Merge pull request #6601 from wRAR/ruff-rules-5
Add more Ruff rules, do some pylint cleanups
2025-01-02 17:38:15 +04:00