elpolilla
9209dbc882
Fixed syntax error in test
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40762
2009-01-22 17:13:27 +00:00
elpolilla
f6021cad2c
Several modifications done:
...
. attribute method moved from ScrapedItem to RobustScrapedItem
. this method was also drastically changed. now it accepts *args and runs the adaptor pipeline for each value
. many adaptors were changed to work with single values instead of lists (due to the previous point's change)
. ExtractImageLinks adaptor was modified to make use of the HTMLImageLinkExtractor
. adaptors were moved from contrib to contrib_exp
. tests were updated
--HG--
rename : scrapy/trunk/scrapy/contrib/adaptors/misc.py => scrapy/trunk/scrapy/contrib_exp/adaptors/misc.py
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40761
2009-01-22 16:41:41 +00:00
elpolilla
4142fd0d5d
Added get_base_url to utils.response and tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40760
2009-01-22 16:40:01 +00:00
elpolilla
86923cc694
Changes in HTMLImageLinkExtractor:
...
. Fixed a small bug that triggered IndexErrors in some cases
. Added support for receiving selectors instead of just raw xpath expressions
. Re-enabled tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40759
2009-01-22 16:39:31 +00:00
Daniel Grana
eebc070fe4
tutorial: fix reference to scrapy-ctl. contributed by Patrick Mezard <pmezard@gmail.com>
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40758
2009-01-22 14:28:20 +00:00
Daniel Grana
8ff4dc4d02
RetryMiddleware: added ConnectionLost to retried exceptions
...
twisted >8.0 has a ConnectionClosed exception that is the parent of ConnectionLost
and ConnectionDone, but twisted 2.5 doesn't.
I add ConnectionLost until we can move forward to twisted >8.0
this is the docstring of ConnectionLost, hopefully it is self-explanatory:
"""Connection to the other side was lost in a non-clean fashion"""
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40757
2009-01-22 03:19:53 +00:00
Pablo Hoffman
bef1fa967a
removed Request.append_callback() method (it was just an alias to Request.deferred.addCallback). refs #48
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40756
2009-01-20 21:49:20 +00:00
Pablo Hoffman
677d0c366f
renamed Request url_encoding constructor argument to encoding. added Request.body tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40755
2009-01-20 21:10:18 +00:00
Pablo Hoffman
12d0bd4dbb
added Content-Length header population to Common downloader middleware
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40754
2009-01-20 21:00:56 +00:00
Ismael Carnales
1e002a0c98
make the install scrapy code steps a list, so they don't show as separate points in the doc overview (we need a style guide)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40753
2009-01-19 14:01:57 +00:00
Ismael Carnales
ddf0366037
removed $ from commands in install; it doesn't look as nice, but it is copy/paste compatible
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40752
2009-01-19 13:51:16 +00:00
Ismael Carnales
1cc95dc129
changed (and fixed) download links for windows libraries in install
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40751
2009-01-19 13:48:06 +00:00
Ismael Carnales
30e928568f
corrected arch linux install information
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40750
2009-01-19 13:37:07 +00:00
Pablo Hoffman
4018f07d17
minor update to topics/settings.rst
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40749
2009-01-19 03:14:23 +00:00
Daniel Grana
a29fed066c
docs: fix MailSender and Settings method references
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40748
2009-01-19 00:35:28 +00:00
Pablo Hoffman
3f13b388c7
added additional test to ResponseSoup extension
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40747
2009-01-18 19:31:35 +00:00
Pablo Hoffman
a413fc3bce
some minor performance improvements in downloader handlers, added scrapy.optional_features set
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40746
2009-01-18 19:20:32 +00:00
Pablo Hoffman
0c9f7257a2
added Request.replace method, improved tests for replace/copy method in Request/Response classes
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40745
2009-01-18 17:52:21 +00:00
Pablo Hoffman
314bbabb30
removed 'domain' from Request attributes and constructor arguments
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40744
2009-01-18 16:55:54 +00:00
Pablo Hoffman
d3c4d1f1e1
removed domain argument from Response constructor
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40743
2009-01-18 16:38:01 +00:00
Pablo Hoffman
db91d26871
removed 'domain' argument from Response objects constructor. besides being a required first constructor argument, it wasn't actually needed and made the Response constructor more complex
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40742
2009-01-18 16:36:17 +00:00
Pablo Hoffman
654b49c86e
added meta argument to Request & Response constructors
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40741
2009-01-17 23:57:53 +00:00
Pablo Hoffman
8ecc6808e0
removed Request.context attribute (use Request.meta instead)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40740
2009-01-17 23:09:53 +00:00
Pablo Hoffman
7e640da433
renamed to_string() Request and Response methods to httprepr(). removed __len__() from Request and Response
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40739
2009-01-17 22:11:54 +00:00
Pablo Hoffman
5dc1e7e5ca
updated request/response reference doc
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40738
2009-01-17 21:05:08 +00:00
Pablo Hoffman
da6a24b662
More Request/Response cleanup:
...
* made status attribute an int
* made engine use __str__ to display crawled requests
* HTTP cache now inherits Response class to change __str__
* added tests to check that the class is preserved on .copy() (for both Requests and Responses)
* removed custom cached attribute (and moved it to a Response.meta item)
* removed some custom (and seldom used) methods from Response class: version(), info()
* reinforced the privacy of the ResponseBody class by renaming it to _ResponseBody and adding a warning that it may be removed in the future
* added tests for Request & Response to_string() methods
* fixed minor (and harmless) bug in to_string() methods
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40737
2009-01-17 20:40:07 +00:00
Pablo Hoffman
b1745f49f1
removed deprecated original_url attribute from Response objects (it can be accessed through Response.request.url)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40736
2009-01-17 15:57:28 +00:00
Pablo Hoffman
7b545381bd
changed log message and increased log level, when spiders return objects which are not Request or ScrapedItem
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40735
2009-01-17 15:22:59 +00:00
Pablo Hoffman
6ba6238c83
Response class:
...
* added meta and cache attributes to Response class
* added tests for Response copy
Request class:
* added meta attribute and renamed old _cache attribute to cache
* moved depth and link_text to Request.meta
* added tests for Request copy
* ResponseLibxml2 and ResponseSoup extensions now use Response.cache
Updated doc with changes
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40734
2009-01-15 03:24:48 +00:00
Pablo Hoffman
d26a54f541
added tests for ResponseSoup and ResponseLibxml2 extensions
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40733
2009-01-15 03:06:00 +00:00
Pablo Hoffman
604af8e74f
doc: removed referer argument from Request constructor
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40732
2009-01-15 00:20:24 +00:00
Pablo Hoffman
2a7b41cdb2
removed referer argument from Request constructor. refs #48
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40731
2009-01-15 00:10:31 +00:00
Daniel Grana
9513b1f465
Remove response references from pipelines. refs #51
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40730
2009-01-14 23:59:45 +00:00
Pablo Hoffman
eef01a9fdd
removed Request.method magic in Request constructor. refs #48
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40729
2009-01-14 23:50:23 +00:00
Pablo Hoffman
ae95c1df68
removed unused (and broken) prepend_callback Request method
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40728
2009-01-14 23:31:24 +00:00
Pablo Hoffman
3f89fc10b7
shortened some line widths
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40727
2009-01-14 23:23:10 +00:00
Pablo Hoffman
1272b138ea
moved HTTP auth functionality out of Request class and into scrapy.utils.request.request_authenticate function, added tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40726
2009-01-14 22:02:58 +00:00
Andres Moreira
8fc4719d0c
Added DNS cache support for the crawler, which improves page download performance by reducing DNS lookups.
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40725
2009-01-14 18:59:52 +00:00
samus_
7be6ff0727
typo
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40724
2009-01-14 12:12:37 +00:00
samus_
baf9a8d846
renamed expiration setting to the same used by the image pipeline
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40723
2009-01-14 12:09:06 +00:00
Pablo Hoffman
0ea37f51db
* moved request fingerprinting from Request class to scrapy.utils.request - closes #50
...
* cleaned up fingerprint tests suite (only left relevant tests)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40722
2009-01-14 01:17:40 +00:00
Pablo Hoffman
519458bdae
added documentation for settings: ENGINE_DEBUG, DOWNLOADER_DEBUG
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40721
2009-01-14 00:19:22 +00:00
Pablo Hoffman
64d1f67c57
decreased logging level of RequestLimitMiddleware to DEBUG
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40720
2009-01-13 21:47:49 +00:00
Pablo Hoffman
4ed811b4d3
added DOWNLOAD_DELAY to default_settings and documentation, fixed some typos in settings reference
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40719
2009-01-13 14:43:38 +00:00
Pablo Hoffman
0e78003c92
removed my email from CLOSEDOMAIN_NOTIFY setting
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40718
2009-01-13 13:50:51 +00:00
Pablo Hoffman
e316722bb1
updated doc: ref/emails.rst and topics/downloader-middleware.rst
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40717
2009-01-13 11:55:20 +00:00
Pablo Hoffman
468bfeb278
removed unused imports
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40716
2009-01-13 10:10:19 +00:00
Pablo Hoffman
95d99d51b9
renamed old SchedulerStats web console module to ScheduleQueue and made it work with the new PriorityQueue
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40715
2009-01-13 02:49:50 +00:00
Pablo Hoffman
deb960526b
removed unused (Django) classes from scrapy.utils.datatypes: MergeDict, SortedDict, DotExpandedDict, FileDict. And also removed unused class gzStringIO
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40714
2009-01-13 01:46:45 +00:00
Pablo Hoffman
ff637d9a0a
added __len__ to PriorityQueue/Stack, and changed __iter__ implementation to return (item, priority) tuples, added more test cases
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40713
2009-01-13 01:37:49 +00:00