mirror of https://github.com/scrapy/scrapy.git synced 2025-02-28 13:04:05 +00:00

3649 Commits

Author SHA1 Message Date
Daniel Graña
32b781a1b8 cookies: increase candidate list with dot prefixed domains 2013-04-24 16:22:40 -03:00
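One way to picture the "candidate list" idea in the commit above: for a request host, collect every domain a stored cookie could have been filed under, both with and without a leading dot. The sketch below is only an illustration under that reading; `candidate_domains` is a hypothetical name and this is not presented as the code the commit actually changed.

```python
def candidate_domains(host):
    """Return cookie-domain candidates for host, with and without a leading dot.

    Hypothetical helper for illustration only.
    """
    labels = host.split(".")
    # Walk from the full host down to its last two labels, e.g.
    # www.example.com -> example.com. A bare top-level label such as "com"
    # is never offered as a candidate for a multi-label host.
    plain = [".".join(labels[i:]) for i in range(max(len(labels) - 1, 1))]
    # Add the dot-prefixed variant of every plain candidate.
    return plain + ["." + domain for domain in plain]


print(candidate_domains("www.example.com"))
# ['www.example.com', 'example.com', '.www.example.com', '.example.com']
```

Looking up only these few keys in a per-domain cookie store avoids scanning every stored domain for each request, which is presumably the point of the related performance work in pull request #77.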
Daniel Graña
31bec08a39 rfc2965 is dead
- It is not enabled by default in python cookielib
- Mozilla rejected implementing it https://bugzilla.mozilla.org/show_bug.cgi?id=208985
- Netscape cookies still rule
- It was superseded by RFC6265, which formalizes the de facto protocol
2013-04-24 15:07:40 -03:00
Daniel Graña
973906153c Merge pull request #77 from shane42/master
Cookie handling performance improvement
2013-04-24 10:59:36 -07:00
Pablo Hoffman
d02da2f31f ported code to use queuelib 2013-04-23 17:48:09 -03:00
Daniel Graña
5531290d53 Merge pull request #292 from nramirezuy/item-unusedimport
deleted unused import
2013-04-19 10:53:20 -07:00
Nicolás Ramírez
bc592e9958 deleted unused import 2013-04-19 14:51:59 -03:00
Pablo Hoffman
7a1536f76e Merge pull request #290 from nramirezuy/item-copy
added copy method to item
2013-04-19 09:27:44 -07:00
Nicolás Ramírez
6df274bba5 added copy method to item 2013-04-19 13:23:53 -03:00
Pablo Hoffman
98f89d1c3f Merge pull request #291 from nramirezuy/cmd-parse-pipelines
parse pipelines test fixed
2013-04-19 09:22:38 -07:00
Pablo Hoffman
626331b865 test_crawl: make mock server print a line when ready, and wait for that line to start tests, instead of waiting for an arbitrary time 2013-04-19 13:06:30 -03:00
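The readiness handshake described in the commit above can be sketched with the standard library alone. This is a minimal illustration, assuming the server announces itself with a single line on stdout; `SERVER_CODE`, `start_server` and `stop_server` are hypothetical names, and the child process here is a stand-in, not the real mock server.

```python
import subprocess
import sys

# Stand-in for the mock server: it "starts up" for a moment, announces
# readiness on stdout, then keeps running until its stdin is closed.
SERVER_CODE = (
    "import sys, time\n"
    "time.sleep(0.2)\n"
    "print('ready', flush=True)\n"
    "sys.stdin.read()\n"
)


def start_server():
    """Start the child process and block until it prints its first line."""
    proc = subprocess.Popen(
        [sys.executable, "-u", "-c", SERVER_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Wait for the readiness line instead of sleeping for a fixed time.
    first_line = proc.stdout.readline().strip()
    return proc, first_line


def stop_server(proc):
    """Shut the child down by closing its stdin, then reap it."""
    proc.stdin.close()  # EOF on stdin lets the child exit
    proc.wait(timeout=5)
    proc.stdout.close()


proc, first_line = start_server()
print(first_line)
stop_server(proc)
```

Compared with a fixed sleep, the tests start as soon as the server is actually up, and a slow builder only delays the run instead of breaking it.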
Nicolás Ramírez
e0c88f2d93 test fixed 2013-04-19 12:38:20 -03:00
Daniel Graña
ce177fa5bc mock server is slow to come up on busy builders 2013-04-19 14:31:27 +00:00
Pablo Hoffman
74e9aecc8d Merge pull request #288 from kmike/patch-1
Update faq.rst
2013-04-17 17:55:45 -07:00
Mikhail Korobov
b245d592aa Update faq.rst
spider.DOWNLOAD_DELAY is deprecated
2013-04-18 02:42:15 +06:00
Pablo Hoffman
9feb65865c Merge pull request #284 from nramirezuy/cmd-parse-pipelines
Command parse, --pipelines argument added
2013-04-09 06:26:24 -07:00
Pablo Hoffman
adf38a65e9 Merge pull request #283 from opyate/patch-1
Update overview.rst, Torrent referenced as TorrentItem in spider
2013-04-08 13:48:05 -07:00
Nicolás Ramírez
2b39527f72 pipelines argument added 2013-04-08 14:55:28 -03:00
Juan M Uys
4de3aa4932 Update overview.rst 2013-04-08 14:13:15 +02:00
Pablo Hoffman
96c2332e0e fix inaccurate downloader middleware documentation. refs #280 2013-04-02 11:35:32 -03:00
Pablo Hoffman
b0ea457c7c Merge pull request #277 from nramirezuy/cmd-parse-args
Spider Arguments support for parse command
2013-03-28 15:38:35 -07:00
Nicolás Ramírez
df19693ed2 Spider Arguments support for parse command and test 2013-03-28 16:49:06 -03:00
Steven Almeroth
70179c7c0c doc: remove trailing spaces 2013-03-21 13:57:39 -06:00
Steven Almeroth
0d7747d353 doc: Response.replace() cannot take meta argument
>>> response.replace(meta={'foo':1})
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/text.py", line 45, in replace
    return Response.replace(self, *args, **kwargs)
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/__init__.py", line 77, in replace
    return cls(*args, **kwargs)
  File "/srv/scrapy/scrapy-fork/scrapy/scrapy/http/response/text.py", line 22, in __init__
    super(TextResponse, self).__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'meta'
2013-03-21 13:49:55 -06:00
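The traceback above shows `replace()` ending in `cls(*args, **kwargs)`, so any keyword the constructor does not accept fails there. The following is a deliberately simplified model of that pattern, not Scrapy's own code; the stand-in `Response` class and its three attributes are assumptions chosen for illustration.

```python
class Response:
    """Simplified stand-in: the constructor accepts url, status and body only."""

    def __init__(self, url, status=200, body=b""):
        self.url = url
        self.status = status
        self.body = body

    def replace(self, **kwargs):
        # Fill in every attribute the caller did not override, then rebuild
        # the object through the constructor.
        for name in ("url", "status", "body"):
            kwargs.setdefault(name, getattr(self, name))
        return self.__class__(**kwargs)


response = Response("http://example.com")
print(response.replace(status=404).status)  # 404

try:
    response.replace(meta={"foo": 1})
except TypeError as exc:
    # The constructor has no 'meta' parameter, so the rebuild fails.
    print("TypeError:", exc)
```

Because the new object is created through the constructor, `replace()` can only accept what `__init__()` accepts, which is why the documentation fix removes `meta` from the list.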
Pablo Hoffman
21c8b89422 Revert "replaced use of deprecated module scrapy.settings with the method get_project_settings()"
This reverts commit 1b4d14c8f635b28d5f72d37f924c4e71d71520ca.

Calling `get_project_settings()` generates a new independent Settings
object that doesn't contain the overrides passed by command line
arguments, for example.

Proper port would require implementing the from_crawler() class method
and making sure the settings object is passed to all internal objects
(probably breaking some minor backwards compatibility).
2013-03-21 12:27:07 -03:00
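The problem described in the revert above (21c8b89422) can be modelled without Scrapy. In this sketch `Settings`, `Crawler`, `StorageUsingGlobal` and `StorageFromCrawler` are hypothetical stand-ins, and `get_project_settings()` is re-implemented here only to mimic the behaviour the message describes: each call returns a fresh object that knows nothing about overrides applied elsewhere.

```python
class Settings:
    """Stand-in settings object holding project defaults."""

    def __init__(self):
        self.values = {"LOG_LEVEL": "DEBUG"}

    def get(self, name):
        return self.values[name]


def get_project_settings():
    # Stand-in: every call builds a new, independent Settings object.
    return Settings()


class Crawler:
    """Stand-in crawler that owns the settings used for the run."""

    def __init__(self, settings):
        self.settings = settings


class StorageUsingGlobal:
    def __init__(self):
        # Builds its own copy, so command-line overrides are lost.
        self.settings = get_project_settings()


class StorageFromCrawler:
    def __init__(self, settings):
        self.settings = settings

    @classmethod
    def from_crawler(cls, crawler):
        # Reuses the crawler's settings object, overrides included.
        return cls(crawler.settings)


crawler = Crawler(get_project_settings())
crawler.settings.values["LOG_LEVEL"] = "ERROR"  # e.g. a command-line override

print(StorageUsingGlobal().settings.get("LOG_LEVEL"))  # DEBUG
print(StorageFromCrawler.from_crawler(crawler).settings.get("LOG_LEVEL"))  # ERROR
```

The second variant is the shape the message calls the "proper port": the settings object travels from the crawler to each component instead of being rebuilt on demand.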
Pablo Hoffman
8c181d87f6 Merge pull request #274 from brunsgaard/master
Updated settings import in contrib/feedexport.py class S3FeedStorage
2013-03-21 08:25:29 -07:00
Jonas Brunsgaard
1b4d14c8f6 replaced use of deprecated module scrapy.settings with the method get_project_settings() 2013-03-21 16:09:14 +01:00
Pablo Hoffman
d0a81d369f initial version of crawl tests using a mock HTTP server (in separate process). This can also be used to benchmark scrapy performance, although a script (specially suited for that task) would be more convenient 2013-03-20 14:48:59 -03:00
Pablo Hoffman
2a5c7ed4da make Crawler.start() return a deferred that is fired when the crawl is finished 2013-03-20 14:48:59 -03:00
Daniel Graña
c43931ea8c Use latest pyOpenSSL for all travis tests environments
SSLv2 was removed from OpenSSL 1.0 and above but it is still referenced
by pyOpenSSL < 0.13. Travis workers are precise hosts with OpenSSL 1.0
and pyOpenSSL 0.12 (!) with a debian patch to work around this problem
that is not present in pyOpenSSL 0.12 shipped by PyPI.

Trying to install pyOpenSSL 0.10 or 0.12 from packages at PyPI under a
system with OpenSSL >= 1.0 will succeed but fail at import time with a
message similar to:

    ImportError: .../lib/python2.7/site-packages/OpenSSL/SSL.so: undefined symbol: SSLv2_method
2013-03-20 11:44:48 -03:00
Pablo Hoffman
9968f99e06 remove ssl from optional_features to simplify code, as it is now required. also deprecate optional_features set 2013-03-20 09:52:40 -03:00
Pablo Hoffman
320bdfe391 Merge pull request #269 from kalessin/settingdict
added support for explicitly interpreting a setting value as a dict
2013-03-19 11:10:54 -07:00
Martin Olveyra
bf480015f4 added support for generic Python literals in settings, and for explicitly
interpreting a setting value as a dict
2013-03-19 16:02:53 -02:00
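As a rough picture of what "interpreting a setting value as a dict" can mean when the value arrives as a string (for example from the command line), here is a minimal sketch built on `ast.literal_eval`. The helpers `parse_literal` and `as_dict` are hypothetical, and the commit above may well implement this differently.

```python
import ast


def parse_literal(value):
    """Interpret a string as a Python literal; return other values unchanged."""
    if not isinstance(value, str):
        return value
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Plain words such as "DEBUG" are not literals and stay strings.
        return value


def as_dict(value):
    """Explicitly interpret a setting value as a dict."""
    # dict() accepts a mapping or an iterable of key/value pairs and raises
    # for anything else, so a malformed value fails loudly.
    return dict(parse_literal(value))


print(parse_literal("[1, 2, 3]"))       # [1, 2, 3]
print(parse_literal("DEBUG"))           # DEBUG
print(as_dict("{'a': 1, 'b': 2}"))      # {'a': 1, 'b': 2}
```

`ast.literal_eval` only evaluates literal syntax, so an arbitrary expression in a setting value is never executed.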
Pablo Hoffman
d246b926bf Merge pull request #273 from plainas/master
Accept ajax requests from other hosts (CORS support)
2013-03-19 07:27:56 -07:00
Pedro
a80ed769d9 allow remote ajax requests to the webservice 2013-03-19 12:00:49 +01:00
Pablo Hoffman
b347c14b5f update engine status output on telnet console documentation 2013-03-18 19:12:12 -03:00
Pablo Hoffman
6f9f6f1f16 Merge pull request #271 from nramirezuy/scraper-6017
Slots removed, now Scraper can handle just one spider
2013-03-18 15:05:34 -07:00
Shane Evans
5c2a82f1f7 fix typo 2013-03-17 19:34:55 +00:00
Nicolás Ramírez
58975abeab Slots removed, now Scraper can handle just one spider 2013-03-15 11:11:51 -03:00
Pablo Hoffman
e630126b82 scrapy deploy: return non-zero exit code if deploy fails 2013-03-14 16:47:35 -03:00
Pablo Hoffman
bb20907254 minor update to faq 2013-03-14 16:43:00 -03:00
Pablo Hoffman
098ccff862 added FAQ about error: "cannot import name crawler" 2013-03-14 12:57:59 -03:00
Pablo Hoffman
a862f23376 added Nicolas Ramirez to AUTHORS 2013-03-14 12:44:39 -03:00
Pablo Hoffman
46b305ef7d Merge pull request #267 from nramirezuy/formrequest
Override Request 'method' in FormRequest
2013-03-14 08:43:53 -07:00
Nicolás Ramírez
f043bba049 Override request method in FormRequest
Changes proposed in the comments
2013-03-14 10:39:17 -03:00
Pablo Hoffman
a2e9d031f9 removed duplicated test 2013-03-14 10:32:19 -03:00
Pablo Hoffman
8391b36251 minor updates to contributing doc 2013-03-13 03:24:25 -03:00
Pablo Hoffman
51c301b3a2 added link to python binary libs, for windows installation 2013-03-13 03:18:33 -03:00
Pablo Hoffman
8e72730792 Merge pull request #261 from stav/allowed_domains
allow spider allowed_domains to be set/tuple, #259
2013-03-12 20:44:51 -07:00
Pablo Hoffman
296db1dc09 log (just once) when duplicate requests are filtered out. closes #105, #249 2013-03-12 19:28:43 -03:00
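The "log just once" behaviour in the commit above can be sketched as a duplicate filter with a one-shot flag. This is an illustration only: `DupeFilter`, `request_seen` and the log wording are assumptions for this sketch, and plain URL strings stand in for request objects.

```python
import logging

logger = logging.getLogger("dupefilter_sketch")


class DupeFilter:
    """Remember seen URLs and report the first filtered duplicate only."""

    def __init__(self):
        self.seen = set()
        self.logged = False

    def request_seen(self, url):
        """Return True if url was seen before, logging only the first hit."""
        if url in self.seen:
            if not self.logged:
                logger.debug(
                    "Filtered duplicate request: %s "
                    "(further duplicates will not be logged)",
                    url,
                )
                self.logged = True
            return True
        self.seen.add(url)
        return False


dupefilter = DupeFilter()
print(dupefilter.request_seen("http://example.com/a"))  # False
print(dupefilter.request_seen("http://example.com/a"))  # True
print(dupefilter.request_seen("http://example.com/a"))  # True
```

A single message tells the user that filtering is happening without flooding the log on crawls that revisit many URLs.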
Pablo Hoffman
a4507f62c3 sep-019: fixed typo 2013-03-12 18:44:09 -03:00