Pablo Hoffman
75cf903e24
adapted project template to use the new Link Extractors location
2009-07-28 12:27:25 -03:00
Pablo Hoffman
64d9155572
CloseDomain extension: fixed bug on domain close when not using CLOSEDOMAIN_TIMEOUT
2009-07-28 12:23:13 -03:00
Pablo Hoffman
fcc91901eb
finished cleaning up closedomain documentation, and updated default settings
2009-07-27 15:42:35 -03:00
Pablo Hoffman
09ba6927d7
Some changes to CloseDomain extension:
- added support for closing by item passed count (CLOSEDOMAIN_ITEMPASSED)
- removed support for sending notification emails (since that's the job of
another extension)
2009-07-27 15:23:50 -03:00
Pablo Hoffman
9da66698f3
moved httprepr() method (from Request and Response objects) to scrapy.utils functions
2009-07-25 18:56:12 -03:00
Daniel Grana
bed3c38014
fix delayedclosedomain extension bug due to changing lastseen from datetime to time.time
2009-07-25 15:22:15 -03:00
Daniel Grana
fdcdc307da
ignore docs/build
2009-07-25 15:21:22 -03:00
Pablo Hoffman
c615ace6bd
doc: updated google directory links in firebug guide
2009-07-24 13:14:36 -03:00
Daniel Grana
96c3cdbec2
improve OffsiteMiddleware reference docs
--HG--
extra : rebase_source : 3ed3f23fc1ec63b521ead029c5749898f3ab05d7
2009-07-24 12:59:38 -03:00
Pablo Hoffman
d21a22eab5
fixed stats collector bug which wasn't throwing the stats_domain_closing signal (on subclasses) before the persisting stage
2009-07-23 13:03:25 -03:00
Pablo Hoffman
38c3f7d0b4
Some changes to logging of scraped items:
1. "Scraped Item" log level changed to DEBUG
2. "Dropped Item" log level changed to WARNING
3. added "Passed Item" log message with INFO level
2009-07-23 11:49:48 -03:00
Pablo Hoffman
e43e28bf1d
minimal doc improvement
2009-07-23 09:12:49 -03:00
Ismael Carnales
6d24ae5920
added reference to working with relative xpaths in the tutorial
2009-07-23 09:05:14 -03:00
Pablo Hoffman
9baa6bb2a8
minor selectors doc fix
2009-07-23 01:56:56 -03:00
Ismael Carnales
6eb2609d1e
removed old serializers from newitem
2009-07-22 15:34:56 -03:00
Ismael Carnales
3c4afb23be
moved newitem from scrapy.contrib_exp to scrapy.newitem
2009-07-22 15:13:36 -03:00
Ismael Carnales
eb3a1dc16c
return default values for newitem in __getitem__
2009-07-22 15:13:33 -03:00
Ismael Carnales
202894dd8f
fixes to newitem doc
2009-07-22 10:25:22 -03:00
Pablo Hoffman
7d8ba0542c
fixed typo in doc
2009-07-21 17:38:46 -03:00
Pablo Hoffman
aa345e116a
Added spider middleware documentation
2009-07-21 17:19:19 -03:00
Ismael Carnales
90f1d9e489
Added Scheduler middleware reference documentation
2009-07-21 16:52:27 -03:00
Ismael Carnales
7afc915717
added BaseItem as base item class, and moved extra functionality of ScrapedItem to RobustScrapedItem
2009-07-21 16:11:21 -03:00
Ismael Carnales
00d9ce6608
fixes to newitem doc
2009-07-21 12:22:49 -03:00
Ismael Carnales
894a5d81d3
changed the newitem API to a dict-like interface
2009-07-21 12:20:49 -03:00
Pablo Hoffman
6fcbd03bff
removed obsolete code
2009-07-21 11:48:31 -03:00
Pablo Hoffman
4a10e1dabd
some cleanup to genspider command, disabled log and fixed inconsistencies reported in #55
2009-07-21 11:31:14 -03:00
Pablo Hoffman
29c92fe818
applied patch contributed by Manuel Aristaran for adding non-standard port support to mysql uri parsing in mysql_connect. closes #82
2009-07-21 01:42:44 -03:00
Pablo Hoffman
9867e7b2d6
removed references to obsolete ENABLED_SPIDERS_FILE setting
2009-07-21 01:11:48 -03:00
Pablo Hoffman
b9e3f72bee
Fixed encoding bug in tricky Response cloning case (reported in #90) and added unittests.
2009-07-20 11:44:12 -03:00
Ismael Carnales
cc33ac4f86
moved item.fields to item._fields in newitem
2009-07-16 13:00:25 -03:00
Pablo Hoffman
92b746d866
SimpledbStatsCollector: moved domain creation to constructor
2009-07-17 12:49:53 -03:00
Pablo Hoffman
97a854a53a
some cleanup to googledir example project
2009-07-17 08:57:45 -03:00
Pablo Hoffman
292757f312
added link to architecture overview and fixed old link
2009-07-16 19:15:19 -03:00
Pablo Hoffman
136014d718
Added section about relative xpaths to XPathSelectors doc
2009-07-16 17:29:29 -03:00
Pablo Hoffman
0fd603e27e
Automated merge with http://hg.scrapy.org/users/ismael/scrapy-newitem/
2009-07-16 11:16:46 -03:00
Pablo Hoffman
c2f12327d6
added dumping of global scrapy stats at engine shutdown
2009-07-16 10:48:10 -03:00
Ismael Carnales
a9069e8eaa
reorganized experimental doc in topics and ref, split long documents, introduced minor changes
2009-07-16 09:58:44 -03:00
Pablo Hoffman
a4a1c4b130
fixed and improved formatting of StatsMailer extension
2009-07-16 09:46:30 -03:00
Daniel Grana
9006d1f040
remove already deleted webconsole extension from default settings
2009-07-16 01:34:51 -03:00
Pablo Hoffman
7deb9d9b0b
removed unused scrapy.utils.datatypes.Sitemap class
2009-07-15 22:18:19 -03:00
Pablo Hoffman
6648b5da8b
updated logging of report in memdebug extension
2009-07-15 22:15:11 -03:00
Pablo Hoffman
1125825996
doc: minor updates to tutorial
2009-07-15 22:10:00 -03:00
Pablo Hoffman
db6afcf3cb
updated scrapy command line help
2009-07-15 22:09:26 -03:00
Pablo Hoffman
5faf725c53
updated parse command help
2009-07-15 22:05:20 -03:00
Pablo Hoffman
9a800b5aba
removed obsolete log command
2009-07-15 21:55:39 -03:00
Pablo Hoffman
fc2c6f99b4
updated help of some commands
2009-07-15 21:34:56 -03:00
Pablo Hoffman
fd4aed7d7b
renamed download command to fetch
--HG--
rename : scrapy/command/commands/download.py => scrapy/command/commands/fetch.py
2009-07-15 21:34:12 -03:00
Pablo Hoffman
a24516ed19
updated list command to show only spider domain names
2009-07-15 21:18:58 -03:00
Pablo Hoffman
5582305648
renamed old STATS_DEBUG setting to STATS_DUMP
2009-07-15 21:02:51 -03:00
Ismael Carnales
996fdb7edf
applied tn patch: check for project_name in scrapy-admin to be a valid module name, closes #92
2009-07-15 09:31:16 -03:00