Ismael Carnales
ff88171b05
added items to tutorial
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40793
2009-01-29 14:35:54 +00:00
Ismael Carnales
04fc0471ec
first version of the new tutorial (work in progress)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40792
2009-01-29 12:29:28 +00:00
elpolilla
3ca2187e3b
Removed non-generic implementation of _add_single_attributes and modified test
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40791
2009-01-29 02:21:08 +00:00
elpolilla
bc4fcbeab7
Added response type recognition to local file urls
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40790
2009-01-28 11:45:39 +00:00
Andres Moreira
e9fbce4ee2
Improved code of utils.markup functions.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40789
2009-01-27 17:58:51 +00:00
elpolilla
c0601cd03e
Refactored remove_entities and fixed the hex entities bug (they weren't recognised). Also, tests were added
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40788
2009-01-27 14:05:22 +00:00
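For readers unfamiliar with the hex entities bug mentioned in the entry above, the following is a minimal sketch of entity decoding that recognises named, decimal, and hexadecimal references. It is not the project's remove_entities code; the name `decode_entities_sketch` is hypothetical, and only the Python standard library is assumed.

```python
import re
from html.entities import name2codepoint

# Matches &name;, &#123; and &#x7B; style references.
_ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def decode_entities_sketch(text):
    """Hypothetical helper: replace entity references with their characters."""

    def convert(match):
        body = match.group(1)
        if body[:2].lower() == "#x":
            # Hexadecimal numeric reference, the case the commit says was missed.
            return chr(int(body[2:], 16))
        if body.startswith("#"):
            # Decimal numeric reference.
            return chr(int(body[1:]))
        codepoint = name2codepoint.get(body)
        # Unknown named entities are left untouched.
        return chr(codepoint) if codepoint is not None else match.group(0)

    return _ENTITY_RE.sub(convert, text)
```

The point of the sketch is that the hexadecimal branch has to be checked before the decimal one, because both start with `#`.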
Pablo Hoffman
c3d61dc999
changing extension in test from csv to txt (not all systems support the text/csv mimetype)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40787
2009-01-27 13:33:28 +00:00
Pablo Hoffman
134b867abe
added support for unknown file extensions to ResponseTypes.from_filename
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40786
2009-01-27 13:08:07 +00:00
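As a rough illustration of why unknown file extensions need a fallback (and why mimetype lookups can differ between systems, as the csv/txt entry above notes), here is a minimal sketch built on the standard `mimetypes` module. It is not the ResponseTypes.from_filename implementation; the function name and the returned labels are assumptions made for this example.

```python
import mimetypes


def guess_response_kind(filename):
    """Hypothetical helper: map a filename to a coarse response kind."""
    mimetype, _encoding = mimetypes.guess_type(filename)
    if mimetype is None:
        # Unknown extension: fall back to a generic response instead of failing.
        return "generic"
    if mimetype == "text/html":
        return "html"
    if mimetype in ("text/xml", "application/xml"):
        return "xml"
    if mimetype.startswith("text/"):
        return "text"
    return "generic"
```

Because `mimetypes.guess_type` returns `None` as the type for extensions it does not know, the first branch is what keeps unknown extensions from raising.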
elpolilla
14e67d59c2
Added test for encoding in csviter
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40785
2009-01-27 12:35:09 +00:00
Pablo Hoffman
1e55e88111
fixed some typos in previous commit
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40784
2009-01-27 12:18:23 +00:00
Pablo Hoffman
11a9236125
added example about creating Requests with cookies
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40783
2009-01-27 12:17:40 +00:00
Pablo Hoffman
14ceca98bf
added FormRequest example
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40782
2009-01-27 12:10:49 +00:00
Pablo Hoffman
e25bfa5d88
added more cases to ResponseTypes, and tests for ResponseTypes
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40781
2009-01-27 11:27:09 +00:00
elpolilla
b6246cbc37
Fixed encoding-related bug in csviter
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40780
2009-01-27 11:10:42 +00:00
Pablo Hoffman
4013497edb
more patches sent by Patrick
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40779
2009-01-26 23:42:51 +00:00
Pablo Hoffman
c1a1b8945a
some doc fixes suggested by Patrick Mézard
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40778
2009-01-26 23:38:21 +00:00
Pablo Hoffman
57189e1b92
applied Patrick Mézard's patch for loading local files
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40777
2009-01-26 23:31:04 +00:00
Pablo Hoffman
ce1700dd8e
added libxml2 installation steps for MacOSX (thanks Patrick Mézard)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40776
2009-01-26 23:28:19 +00:00
Pablo Hoffman
279b9ac40d
updated adaptors docs to reflect their instability and improved the ReST formatting (80 column lines, no ugly pipes at the beginning of lines)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40775
2009-01-26 23:22:53 +00:00
elpolilla
e8e87bcb52
Fixed small bug in adaptor pipelines that tried to work with None values
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40774
2009-01-26 16:20:58 +00:00
Pablo Hoffman
7ed88fd0f3
added Content-Disposition encoding discovery to ResponseTypes
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40773
2009-01-26 15:36:53 +00:00
elpolilla
9de6ee5109
Added missing test for change in r769
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40772
2009-01-26 15:25:53 +00:00
elpolilla
91eff31f18
. Modified the default value of the BOT_NAME setting to the project's name
...
. Modified spider templates to use the already-generated example item instead of a ScrapedItem
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40771
2009-01-26 15:24:52 +00:00
elpolilla
f63a661320
Normalized the usage of ints for storing http status codes
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40770
2009-01-26 15:03:47 +00:00
elpolilla
74661d54d0
Added application/xml mimetype to the known response types
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40769
2009-01-26 15:03:14 +00:00
elpolilla
b9d3a2cb96
Improved adaptors debugging in order to make it clearer for reading
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40768
2009-01-26 12:15:31 +00:00
Pablo Hoffman
bc4e80f640
reverted to IO-blocking MailSender implementation (using standard smtplib) until we fix some problems with deferreds left unexecuted when stopping the engine
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40767
2009-01-26 03:40:59 +00:00
Pablo Hoffman
5f0d5a1653
Big Response/Request refactoring:
...
- added Response subclasses: TextResponse, HtmlResponse, XmlResponse
- made Response.body a str
- added Response.body_as_unicode() method
- added encoding attribute for TextResponse and subclasses
- added headers_encoding() and body_encoding() to TextResponse and subclasses
- added ResponseTypes class to guess the Response class to use based on
mimetype and other criteria
- added and improved several Request/Response tests
- updated request/response documentation to reflect the changes
Other changes not related to encoding:
- added libxml2debug for debugging libxml2 memory leaks, which can be enabled
by an environment variable
- added memoizemethod decorator (implemented using descriptors) to cache the
result of methods
- moved DecompressionMiddleware to contrib_exp
--HG--
rename : scrapy/trunk/scrapy/tests/test_spiders/testplugin.py => scrapy/trunk/scrapy/tests/test_spiders/testspider.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40766
2009-01-26 02:57:03 +00:00
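The entry above mentions a memoizemethod decorator implemented using descriptors. The following is a minimal sketch of that general technique, not the code from this commit; the name `memoize_method_sketch` and the per-instance cache attribute are assumptions, and it only handles hashable positional arguments.

```python
import functools


class memoize_method_sketch:
    """Hypothetical descriptor that caches method results per instance."""

    def __init__(self, func):
        self.func = func
        self.cache_name = "_memo_" + func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: return the descriptor.
            return self
        # Each instance keeps its own cache, keyed by the call arguments.
        cache = instance.__dict__.setdefault(self.cache_name, {})

        @functools.wraps(self.func)
        def bound(*args):
            if args not in cache:
                cache[args] = self.func(instance, *args)
            return cache[args]

        return bound
```

Storing the cache on the instance rather than on the descriptor means cached results are released together with the object they belong to.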
elpolilla
dbaf602730
. Updated some adaptors docstrings
...
. Turned Delist and Unquote adaptors into factory functions instead of classes
. Updated tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40765
2009-01-26 01:15:05 +00:00
Pablo Hoffman
33df3c81c2
explained what the heck is scrapy.contrib_exp
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40764
2009-01-24 16:59:08 +00:00
Daniel Grana
a07c95003f
duplicatesfilter: first version of configurable duplicate requests filtering middleware
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40763
2009-01-23 03:42:05 +00:00
elpolilla
9209dbc882
Fixed syntax error in test
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40762
2009-01-22 17:13:27 +00:00
elpolilla
f6021cad2c
Several modifications done:
...
. attribute method moved from ScrapedItem to RobustScrapedItem
. this method was also drastically changed. now it accepts *args and runs the adaptor pipeline for each value
. many adaptors were changed to work with single values instead of lists (due to the previous point's change)
. ExtractImageLinks adaptor was modified to make use of the HTMLImageLinkExtractor
. adaptors were moved from contrib to contrib_exp
. tests were updated
--HG--
rename : scrapy/trunk/scrapy/contrib/adaptors/misc.py => scrapy/trunk/scrapy/contrib_exp/adaptors/misc.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40761
2009-01-22 16:41:41 +00:00
elpolilla
4142fd0d5d
Added get_base_url to utils.response and tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40760
2009-01-22 16:40:01 +00:00
elpolilla
86923cc694
Changes in HTMLImageLinkExtractor:
...
. Fixed little bug that triggered IndexErrors in some cases
. Added support for receiving selectors instead of just raw xpath expressions
. Re-enabled tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40759
2009-01-22 16:39:31 +00:00
Daniel Grana
eebc070fe4
tutorial: fix reference to scrapy-ctl. contributed by Patrick Mezard <pmezard@gmail.com>
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40758
2009-01-22 14:28:20 +00:00
Daniel Grana
8ff4dc4d02
RetryMiddleware: added ConnectionLost to retried exceptions
...
twisted >8.0 has a ConnectionClosed exception that is the parent of ConnectionLost
and ConnectionDone, but twisted 2.5 doesn't.
I'm adding ConnectionLost until we can move forward to twisted >8.0.
This is the docstring of ConnectionLost, which is hopefully self-explanatory:
"""Connection to the other side was lost in a non-clean fashion"""
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40757
2009-01-22 03:19:53 +00:00
Pablo Hoffman
bef1fa967a
removed Request.append_callback() method (it was just an alias to Request.deferred.addCallback). refs #48
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40756
2009-01-20 21:49:20 +00:00
Pablo Hoffman
677d0c366f
renamed Request url_encoding constructor argument to encoding. added Request.body tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40755
2009-01-20 21:10:18 +00:00
Pablo Hoffman
12d0bd4dbb
added Content-Length header population to Common downloader middleware
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40754
2009-01-20 21:00:56 +00:00
Ismael Carnales
1e002a0c98
made the scrapy code install steps a list, so they don't show as separate points in the doc overview (we need a style guide)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40753
2009-01-19 14:01:57 +00:00
Ismael Carnales
ddf0366037
removed $ from commands in install; it doesn't look as nice but it's copy/paste compatible
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40752
2009-01-19 13:51:16 +00:00
Ismael Carnales
1cc95dc129
changed (and fixed) download links for windows libraries in install
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40751
2009-01-19 13:48:06 +00:00
Ismael Carnales
30e928568f
corrected arch linux install information
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40750
2009-01-19 13:37:07 +00:00
Pablo Hoffman
4018f07d17
minor update to topics/settings.rst
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40749
2009-01-19 03:14:23 +00:00
Daniel Grana
a29fed066c
docs: fix MailSender and Settings method references
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40748
2009-01-19 00:35:28 +00:00
Pablo Hoffman
3f13b388c7
added additional test to ResponseSoup extension
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40747
2009-01-18 19:31:35 +00:00
Pablo Hoffman
a413fc3bce
some minor performance improvements in downloader handlers, added scrapy.optional_features set
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40746
2009-01-18 19:20:32 +00:00
Pablo Hoffman
0c9f7257a2
added Request.replace method, improved tests for replace/copy method in Request/Response classes
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40745
2009-01-18 17:52:21 +00:00
Pablo Hoffman
314bbabb30
removed 'domain' from Request attributes and constructor arguments
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40744
2009-01-18 16:55:54 +00:00