mirror of https://github.com/scrapy/scrapy.git synced 2025-02-24 10:03:55 +00:00

7 Commits

Author SHA1 Message Date
Mikhail Korobov
44bfcbcf0f TST split LinkExtractorTestCase.test_extraction into several methods; remove duplicated test 2015-08-31 00:49:38 +05:00
Mikhail Korobov
9bfe6ece59 Merge branch 'master' into py3-linkextractors
Conflicts:
	scrapy/linkextractors/lxmlhtml.py
	tests/test_linkextractors.py
2015-08-28 04:53:32 +05:00
Mikhail Korobov
f2edbd05de PY3 port LinkExtractor
* tests for other link extractors are moved to test_linkextractors_deprecated.py
* in Python 3 Link is converted to use native strings for urls
* minor cleanups
2015-08-28 04:11:30 +05:00
Mikhail Korobov
f46a450080 refactor test_linkextractors
* rename LinkExtractorTestCase to BaseSgmlLinkExtractorTestCase
* add BaseLinkExtractorTestCase, which link extractor tests can inherit from,
  and decouple it from SgmlLinkExtractor
* add an extra check for deny_extensions
* xfail test_restrict_xpaths_with_html_entities for LxmlLinkExtractor explicitly
2015-08-28 04:11:30 +05:00
Rafał Gutkowski
cb3007c066 support link rel attribute with multiple values 2015-08-27 20:13:47 +02:00
Andrew Scorpil
de15fcdf33 [LinkExtractors] Ignore bogus links
(rebased the code for scrapy 1.0 and made a few code improvements --nyov)
2015-08-15 00:16:39 +00:00
Julia Medina
cf064b1437 Move scrapy/contrib/linkextractors to scrapy/linkextractors 2015-04-29 21:24:30 -03:00