mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 17:57:55 +00:00

4157 Commits

Author SHA1 Message Date
Daniel Graña
958d0dbf39 move tests requirements to a single place 2014-01-03 18:55:54 -02:00
Pablo Hoffman
dadaedce3e Merge pull request #512 from yprez/patch-1
Item Pipeline docs - remove unused import from code sample
2014-01-03 06:49:55 -08:00
Yuri Prezument
060891c01c Remove unused import from code sample
Item pipeline docs - removed unused import from code sample
2014-01-03 15:44:17 +02:00
Mikhail Korobov
292d21ca86 spelling: instanciated -> instantiated 2014-01-01 11:52:07 +06:00
Mikhail Korobov
d7fbccdff7 Merge pull request #510 from dangra/kmike-reanme-base-spider
Rename BaseSpider to Spider. Fixes #495, fixes #501.
2013-12-31 21:47:26 -08:00
Pablo Hoffman
e2f091801c Merge pull request #507 from dangra/fromresponse_override_url
[MRG] support overriding url in FormRequest.from_response
2013-12-31 12:01:13 -08:00
Daniel Graña
b41ad38fab warn when the deprecated class is instanciated 2013-12-31 16:40:00 -02:00
Daniel Graña
8c4e1db5fe Hold the deprecated class reference in the metaclass 2013-12-31 16:02:03 -02:00
Daniel Graña
f6a0ac8325 support overriding url in FormRequest.from_response 2013-12-30 16:58:11 -02:00
Mikhail Korobov
5af45689e4 allow caller to customize clsdict 2013-12-31 00:17:42 +06:00
Mikhail Korobov
1e96063ffa simplify deprecation code (@dangra’s idea) 2013-12-30 23:57:27 +06:00
Mikhail Korobov
fa82cc6794 Merge pull request #506 from dangra/ahbeng-368-get-func-args-partial-support
get func args partial support - based on #504
2013-12-30 08:45:38 -08:00
Mikhail Korobov
39458b7843 Rename warn_when_subclassed to deprecated_base_class; add some magic for issubclass and isinstance checks to work if subclasses of non-deprecated class are checked against deprecated class. 2013-12-30 20:59:10 +06:00
Mikhail Korobov
04788673e7 improved deprecation code; add some test 2013-12-30 19:46:41 +06:00
Mikhail Korobov
a27d91f0a6 Rename BaseSpider to Spider. See GH-495. 2013-12-30 19:46:41 +06:00
Mikhail Korobov
439a141aae utility for showing a warning when a class is subclassed 2013-12-30 19:46:41 +06:00
Daniel Graña
74433a17e7 Merge pull request #505 from kmike/scrapy-shell-docs
[MRG] minor fixes to scrapy shell docs
2013-12-30 04:08:50 -08:00
Daniel Graña
22e107194b test loader processors using partial functions 2013-12-30 09:26:02 -02:00
Daniel Graña
589fd037d9 do not return applied arguments on partial functions 2013-12-30 09:25:11 -02:00
Mikhail Korobov
e713733edf minor fixes to scrapy shell docs
* better IPython links;
* MDC link instead of w3schools;
* small formatting fixes;
* show quoted URL in example
2013-12-30 10:27:39 +06:00
Daniel Graña
f9dd2986d4 Merge pull request #503 from kmike/better-tox-ini
[MRG] allow running individual tests via tox
2013-12-29 14:03:29 -08:00
Beng Hee Eu
09c3f53693 Fixes #368: Support functools.partial in scrapy.utils.python.get_func_args() 2013-12-30 03:46:10 +08:00
Mikhail Korobov
b15df73a0f allow running individual tests via tox, e.g. «tox -e trunk -- scrapy.tests.test_spider» 2013-12-29 01:59:20 +06:00
Mikhail Korobov
f18ac02987 remove duplicated link extractors link
Check http://doc.scrapy.org/en/latest/topics/link-extractors.html - two menu items are highlighted at the left.
2013-12-28 05:40:10 +05:00
Mikhail Korobov
9a999daa2a DOWNLOAD_DELAY docs clarification:
* delay is enforced per website, not per spider;
* document download_delay attribute (it was previously documented only in FAQ about 999 error codes);
* document how CONCURRENT_REQUESTS_PER_IP affects download delays.
2013-12-28 06:30:34 +06:00
Pablo Hoffman
99f02b11f5 Merge pull request #498 from mbacho/master
Removing extra 'doc' extension, adding 'pptx' and 'xlsx' extensions
2013-12-24 06:36:06 -08:00
Chomba Ng'ang'a
95dc46f2bc Removing extra 'doc' extension, adding 'pptx' and 'xlsx' extensions 2013-12-24 17:29:40 +03:00
Pablo Hoffman
e42e3743fe quick documentation for #475 2013-12-24 12:19:15 -02:00
Daniel Graña
c4cd0f45a0 reindent travis.yml as per travis gem defaults 2013-12-24 11:26:48 -02:00
Daniel Graña
e91025536f Merge pull request #475 from dangra/473-do-not-send-header
[MRG] Do not set Referer by default when its value is None
2013-12-24 05:20:36 -08:00
Daniel Graña
2b7fea26a5 Do not set Referer by default when its value is None
closes #473
2013-12-24 11:05:06 -02:00
Daniel Graña
2c2ce20878 Merge pull request #490 from max-arnold/master
add new pipeline methods to get file/image/thumbnail paths
2013-12-23 15:11:53 -08:00
Mikhail Korobov
e0cebbfc8f add a remark about 1% 2013-12-20 23:12:37 +06:00
Mikhail Korobov
ee46ec8920 kill AjaxCrawlableMiddlewar.enabled_setting 2013-12-20 19:08:49 +06:00
Pablo Hoffman
80bb9fb5d2 Merge pull request #497 from kmike/utils-unique
[MRG] no need to use dict in scrapy.utils.python.unique
2013-12-19 05:40:34 -08:00
Mikhail Korobov
84aa7599a4 no need to use dict in scrapy.utils.python.unique 2013-12-19 15:06:27 +06:00
Mikhail Korobov
93eee9d1c6 Merge pull request #479 from citizen-stig/unicode_spider_name
fix logging error with unicode spider name
2013-12-18 12:17:57 -08:00
Mikhail Korobov
943a0bd264 AjaxCrawlableMiddleware in Broad Crawl docs 2013-12-19 01:01:26 +06:00
Mikhail Korobov
84a3a9daac extra test 2013-12-19 00:35:46 +06:00
Mikhail Korobov
503ab58f59 Fail-fast path.
For me middleware now can process about 2-3k ajax crawlable pages/sec and 50k+ regular pages/sec (if they don’t contain «fragment» or «content» words).
2013-12-19 00:28:47 +06:00
Mikhail Korobov
71c59e1c94 add an undocumented setting for lookup body size 2013-12-19 00:23:38 +06:00
Mikhail Korobov
a87b3bd1c8 AjaxCrawlableMiddleware 2013-12-19 00:06:47 +06:00
Mikhail Korobov
51f340e1b9 add distutils folders to gitignore 2013-12-18 23:29:07 +06:00
Max Arnold
87559a1240 fix missing keyword arguments for file_path in media_downloaded() 2013-12-18 23:23:01 +07:00
Max Arnold
d8ca8f83bb test for new response and info keyword arguments 2013-12-18 23:09:37 +07:00
Max Arnold
2dfeff7eca rename ImagesPipelineTestCase image_path var to file_path for clarity 2013-12-18 22:54:47 +07:00
Max Arnold
6959523338 implement DeprecatedImagesPipelineTestCase 2013-12-18 22:54:39 +07:00
Max Arnold
65c017c5fe implement DeprecatedFilesPipelineTestCase 2013-12-18 22:25:42 +07:00
Max Arnold
14b68b47d6 rename FilesPipelineTestCase image_path var to file_path for clarity 2013-12-18 22:13:49 +07:00
Max Arnold
23c50bd973 rename key to path in FilesPipelineTestCase 2013-12-18 10:52:13 +07:00