- added Response subclasses: TextResponse, HtmlResponse, XmlResponse
- made Response.body a str
- added Response.body_as_unicode() method
- added encoding attribute for TextResponse and subclasses
- added headers_encoding() and body_encoding() to TextResponse and subclasses
- added ResponseTypes class to guess the Response class to use based on
mimetype and other criteria
- added and improved several Request/Response tests
- updated request/response documentation to reflect the changes
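
A rough sketch of the mimetype-based guessing described above. The class names come from this commit, but the lookup table, the fallback rules and the guess_response_class() helper are assumptions made for illustration and are not the actual ResponseTypes implementation:

```python
# Stand-in classes named after the ones added in this commit (bodies omitted).
class Response:
    pass


class TextResponse(Response):
    pass


class HtmlResponse(TextResponse):
    pass


class XmlResponse(TextResponse):
    pass


# Assumed mapping from mimetype to Response class (illustrative only).
MIMETYPE_CLASSES = {
    "text/html": HtmlResponse,
    "application/xhtml+xml": HtmlResponse,
    "text/xml": XmlResponse,
    "application/xml": XmlResponse,
}


def guess_response_class(content_type):
    """Hypothetical helper: pick a Response class from a Content-Type value."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mimetype = content_type.split(";", 1)[0].strip().lower()
    if mimetype in MIMETYPE_CLASSES:
        return MIMETYPE_CLASSES[mimetype]
    # Any other text/* type falls back to the generic text class.
    if mimetype.startswith("text/"):
        return TextResponse
    return Response
```

The real class is described as also using "other criteria", which this sketch does not attempt to cover.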
Other changes not related to encoding:
- added libxml2debug for debugging libxml2 memory leaks, which can be enabled
by an environment variable
- added memoizemethod decorator (implemented using descriptors) to cache the
result of methods
- moved DecompressionMiddleware to contrib_exp
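
For reference, a minimal sketch of how a descriptor-based memoizemethod decorator could look. This is not the code from this commit: the per-instance cache dictionary, the "_memo_" attribute prefix and the Example class are assumptions, and only hashable positional arguments are handled:

```python
import functools


class memoizemethod:
    """Illustrative descriptor that caches method results per instance."""

    def __init__(self, func):
        self.func = func
        # Assumed name for the per-instance cache attribute.
        self.cache_name = "_memo_" + func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class: return the descriptor itself.
            return self
        cache = instance.__dict__.setdefault(self.cache_name, {})

        @functools.wraps(self.func)
        def wrapper(*args):
            # Compute once per distinct argument tuple, then reuse.
            if args not in cache:
                cache[args] = self.func(instance, *args)
            return cache[args]

        return wrapper


class Example:
    """Hypothetical class used only to exercise the decorator."""

    def __init__(self):
        self.calls = 0

    @memoizemethod
    def square(self, x):
        self.calls += 1
        return x * x
```

Calling Example().square(3) twice on the same instance runs the method body once; a second instance gets its own cache.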
--HG--
rename : scrapy/trunk/scrapy/tests/test_spiders/testplugin.py => scrapy/trunk/scrapy/tests/test_spiders/testspider.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40766
. attribute method moved from ScrapedItem to RobustScrapedItem
. this method was also drastically changed. now it accepts *args and runs the adaptor pipeline for each value
. many adaptors were changed to work with single values instead of lists (due to the previous point's change)
. ExtractImageLinks adaptor was modified to make use of the HTMLImageLinkExtractor
. adaptors were moved from contrib to contrib_exp
. tests were updated
--HG--
rename : scrapy/trunk/scrapy/contrib/adaptors/misc.py => scrapy/trunk/scrapy/contrib_exp/adaptors/misc.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40761
. Fixed a small bug that triggered IndexErrors in some cases
. Added support for receiving selectors instead of just raw xpath expressions
. Re-enabled tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40759
Twisted >8.0 has a ConnectionClosed exception that is the parent of ConnectionLost
and ConnectionDone, but Twisted 2.5 does not.
I am adding ConnectionLost until we can move forward to Twisted >8.0.
This is the docstring of ConnectionLost, which is hopefully self-explanatory:
"""Connection to the other side was lost in a non-clean fashion"""
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40757
* made status attribute an int
* made engine use __str__ to display crawled requests
* HTTP cache now inherits Response class to change __str__
* added tests to check that the class is preserved on .copy() (for both Requests and Responses)
* removed custom cached attribute (and moved it to a Response.meta item)
* removed some custom (and seldom used) methods from Response class: version(), info()
* reinforced the privacy of the ResponseBody class by renaming it to _ResponseBody and adding a warning that it may be removed in the future
* added tests for Request & Response to_string() methods
* fixed minor (and harmless) bug in to_string() methods
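
A simplified sketch of the behaviour the copy tests are meant to check: .copy() should return an instance of the same subclass, with status as an int and meta carried over. The constructor signature and the CachedResponse name are assumptions for illustration, not the real classes:

```python
import copy


class Response:
    """Stand-in Response with only the attributes needed for the sketch."""

    def __init__(self, url, status=200, meta=None):
        self.url = url
        self.status = int(status)  # status is kept as an int
        self.meta = dict(meta or {})

    def copy(self):
        # Use self.__class__ so that subclasses are preserved on copy.
        return self.__class__(
            self.url, status=self.status, meta=copy.deepcopy(self.meta)
        )


class CachedResponse(Response):
    """Hypothetical subclass that only changes __str__."""

    def __str__(self):
        return "<%d %s> (cached)" % (self.status, self.url)
```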
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40737
* added meta and cache attributes to Response class
* added tests for Response copy
Request class:
* added meta attribute and renamed old _cache attribute to cache
* moved depth and link_text to Request.meta
* added tests for Request copy
* ResponseLibxml2 and ResponseSoup extensions now use Response.cache
Updated doc with changes
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40734