mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 05:43:56 +00:00

4128 Commits

Author SHA1 Message Date
Nikita Nikishin
a676017fe4 Fixed #441. 2014-02-18 03:50:43 -05:00
Paul Tremberth
66fb1738a2 Remove _log_level attribute as per comments 2014-02-17 13:42:42 +01:00
Paul Tremberth
41765ca18d DupeFilter: add setting for verbose logging + stats counter for filtered requests 2014-02-17 13:42:42 +01:00
Mikhail Korobov
42dc34f924 TST Improved twisted installation in tox.ini for Python 3.3
Install twisted trunk because released twisted doesn't install properly,
see https://twistedmatrix.com/trac/ticket/6539.

Also, test-requirements are removed for 3.3 because mitmproxy doesn't
install in Python 3.3, and mock is in Python 3.3 standard library.

Note that trial scrapy is not doing what is expected because twisted doesn't
have trial runner for 3.3 yet, so this command will most likely call "trial"
outside tox virtualenv and will use incorrect interpreter.
2014-02-17 17:21:38 +06:00
Pablo Hoffman
d24212c8e5 remove references to deprecated scrapy-developers list 2014-02-16 21:44:49 -02:00
Pablo Hoffman
fcbd7b4901 Merge pull request #596 from darkrho/doc-pipeline
DOC Use pipelines module name instead of pipieline following default project files.
2014-02-16 14:19:52 -08:00
Rolando Espinoza
28f946b05f DOC Use pipelines module name instead of pipieline following default project files. 2014-02-15 11:01:26 -04:00
Daniel Graña
180fc98cd8 Add 0.22.2 release notes 2014-02-14 15:45:44 -02:00
Daniel Graña
6ca49ce76a Merge pull request #594 from dangra/593-engineslots
fix a reference to unexistent engine.slots.
2014-02-14 09:39:30 -08:00
Daniel Graña
b58285b635 fix a reference to unexistent engine.slots. closes #593 2014-02-14 15:31:05 -02:00
Pablo Hoffman
c886d7459f add SEP-021 (Add-ons) - work in progress 2014-02-11 20:16:09 -02:00
Mikhail Korobov
9caf37553e Merge pull request #590 from Digenis/master
downloaderMW doc typo (spiderMW doc copy remnant)
2014-02-12 02:49:30 +05:00
Nikolaos-Digenis Karagiannis
43a797e2f7 downloaderMW doc typo (spiderMW doc copy remnant) 2014-02-11 22:30:00 +02:00
tracicot
b2f4b296df Correct typos 2014-02-10 11:46:23 -02:00
Daniel Graña
783d8e3ae1 Add 0.22.1 release notes 2014-02-08 17:09:04 -02:00
Daniel Graña
877e345d79 localhost666 can resolve under certain circumstances
localhost666 is resolvable if the host running the tests contains a
"search mydomain.com" in /etc/resolv.conf and a wildcard dns entry
exists for *.mydomain.com
2014-02-08 16:38:22 -02:00
Daniel Graña
45c9dab15b Merge pull request #582 from kmike/inspect-workaround
[WIP] Handle cases when inspect.stack() fails
2014-02-06 19:41:32 -08:00
Daniel Graña
9e95b36a09 test inspect.stack failure 2014-02-07 01:36:43 -02:00
Daniel Graña
c12e3a58b4 Merge pull request #585 from kmike/mitmproxy-reqs-cleanup
[MRG] testing PIL dependency is removed because there is a new mitmproxy version
2014-02-05 14:39:56 -08:00
Daniel Graña
3d4fe60e47 Merge pull request #584 from dangra/581-deprecated-subclass-fix
Fix wrong checks on subclassing of deprecated classes
2014-02-05 13:48:40 -08:00
Daniel Graña
ee896b154c Fix wrong checks on subclassing of deprecated classes. closes #581 2014-02-05 19:47:04 -02:00
Mikhail Korobov
8e88fbc255 testing PIL dependency is removed because there is a new mitmproxy version 2014-02-06 00:49:26 +06:00
Mikhail Korobov
8a1905e6ee Handle cases when inspect.stack() fails 2014-02-05 02:28:51 +06:00
Daniel Graña
45417e6b03 Merge pull request #574 from redapple/htmllx-tests
Fix HtmlParserLinkExtractor and tests after #485 merge
2014-02-02 19:09:02 -08:00
Mikhail Korobov
b74e7dd30a Merge pull request #575 from redapple/docs-tutorialindent
Docs: 4-space indent for final spider example
2014-02-01 15:07:01 -08:00
Paul Tremberth
57f30bcb04 Docs: 4-space indent for final spider example 2014-02-01 23:34:55 +01:00
Paul Tremberth
8017d85d8a Fix HtmlParserLinkExtractor and tests after #485 merge 2014-02-01 22:52:56 +01:00
Pablo Hoffman
e31fb49320 make 'basic' the default template spider in genspider, and added info with next steps to startproject. closes #488 2014-02-01 17:38:31 -02:00
Pablo Hoffman
c928eef585 Merge pull request #485 from bblanchon/sgml-extractor-link-text
SgmlLinkExtractor: Fixed link text when there is an inner tag
2014-02-01 10:28:56 -08:00
Paul Tremberth
38a96fba2d CrawSpider: support process_links as generator
Adding tests for CrawlSpider's process_links
2014-01-31 21:25:05 +01:00
Daniel Graña
e1c6d3ffdf Merge pull request #566 from redapple/offsite-stats
OffsiteMiddleware: add 2 stats counters
2014-01-31 10:05:35 -08:00
Daniel Graña
1e553ec0cd Merge pull request #570 from redapple/travistests
[MRG] Fix tests for Travis-CI build
2014-01-31 09:16:54 -08:00
Paul Tremberth
cc157f182b Fix tests for Travis-CI build 2014-01-31 01:27:54 +01:00
Mikhail Korobov
05158803df Merge pull request #571 from darkrho/httpcache-storage-doc
DOC Fixed HTTPCACHE_STORAGE typo in the default value which is now Files...
2014-01-30 08:48:39 -08:00
Rolando Espinoza
a6279fe95b DOC Fixed HTTPCACHE_STORAGE typo in the default value which is now Filesystem instead Dbm. 2014-01-30 11:53:42 -04:00
Paul Tremberth
fd5b40593a Always enable offsite stats + refactor test to initialize crawler 2014-01-29 17:37:24 +01:00
Paul Tremberth
1a545157c6 Offsite: add 2 stats counters 2014-01-29 17:37:24 +01:00
Pablo Hoffman
21f0e4048b Merge pull request #565 from dangra/562-sgmlinkextractor
replace unencodeable codepoints with html entities
2014-01-27 22:01:39 -08:00
Pablo Hoffman
116a1df4b0 Merge pull request #557 from darkrho/expose-crawler-in-shell
Expose current crawler in the scrapy shell.
2014-01-27 21:08:02 -08:00
Daniel Graña
66829c962f replace unencodeable codepoints with html entities. fixes #562 and #285 2014-01-27 11:37:09 -02:00
Daniel Graña
8ecf0b786d Merge pull request #556 from darkrho/item-loader-nones
Make `ItemLoader` ignore `None` values from processors.
2014-01-27 04:21:09 -08:00
Daniel Graña
b14dabb281 Merge pull request #561 from redapple/regexlx-encoding
RegexLinkExtractor: encode URL unicode value when creating Links
2014-01-24 07:33:23 -08:00
Rolando Espinoza
4255e12bc7 Updated the tutorial crawl output with latest output. 2014-01-23 18:18:56 -04:00
Rolando Espinoza
9aab9224cb Updated shell docs with the crawler reference and fixed the actual shell output.
Also updated the shell example with a reproducible code example.
2014-01-23 18:04:57 -04:00
Paul Tremberth
f87859a627 RegexLinkExtractor: encode URL unicode value when creating Links 2014-01-23 19:41:18 +01:00
Rolando Espinoza
4081ba238d PEP8 minor edits. 2014-01-23 11:13:54 -04:00
Rolando Espinoza
240fdde667 Expose current crawler in the scrapy shell. 2014-01-23 11:08:56 -04:00
Rolando Espinoza
b93412059d Unused re import and PEP8 minor edits. 2014-01-23 10:37:33 -04:00
Rolando Espinoza
420efe77b2 Ignore None's values when using the ItemLoader. 2014-01-23 10:36:06 -04:00
Pablo Hoffman
2d60f86084 Merge pull request #535 from redapple/xpath-smartstrings
Disable smart strings in lxml XPath evaluations
2014-01-22 07:41:31 -08:00