mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 05:23:47 +00:00

1600 Commits

Author SHA1 Message Date
Pablo Hoffman
75cf903e24 adapted project template to use the new Link Extractors location 2009-07-28 12:27:25 -03:00
Pablo Hoffman
64d9155572 CloseDomain extension: fixed bug on domain close when not using CLOSEDOMAIN_TIMEOUT 2009-07-28 12:23:13 -03:00
Pablo Hoffman
fcc91901eb finished cleaning up closedomain documentation, and updated default settings 2009-07-27 15:42:35 -03:00
Pablo Hoffman
09ba6927d7 Some changes to CloseDomain extension:
- added support for closing by item passed count (CLOSEDOMAIN_ITEMPASSED)
- removed support for sending notification emails (since that's the job of
  another extension)
2009-07-27 15:23:50 -03:00
Pablo Hoffman
9da66698f3 moved httprepr() method (from Request and Response objects) to scrapy.utils functions 2009-07-25 18:56:12 -03:00
Daniel Grana
bed3c38014 fix delayedclosedomain extension bug due to changing lastseen from datetime to time.time 2009-07-25 15:22:15 -03:00
Daniel Grana
fdcdc307da ignore docs/build 2009-07-25 15:21:22 -03:00
Pablo Hoffman
c615ace6bd doc: updated google directory links in firebug guide 2009-07-24 13:14:36 -03:00
Daniel Grana
96c3cdbec2 improve OffsiteMiddleware reference docs
--HG--
extra : rebase_source : 3ed3f23fc1ec63b521ead029c5749898f3ab05d7
2009-07-24 12:59:38 -03:00
Pablo Hoffman
d21a22eab5 fixed stats collector bug which wasn't throwing the stats_domain_closing signal (on subclasses) before the persisting stage 2009-07-23 13:03:25 -03:00
Pablo Hoffman
38c3f7d0b4 Some changes to logging of scraped items:
1. "Scraped Item" log level changed to DEBUG
2. "Dropped Item" log level changed to WARNING
3. added "Passed Item" log message with INFO level
2009-07-23 11:49:48 -03:00
Pablo Hoffman
e43e28bf1d minimal doc improvement 2009-07-23 09:12:49 -03:00
Ismael Carnales
6d24ae5920 added reference to working with relative xpaths in the tutorial 2009-07-23 09:05:14 -03:00
Pablo Hoffman
9baa6bb2a8 minor selectors doc fix 2009-07-23 01:56:56 -03:00
Ismael Carnales
6eb2609d1e removed old serializators from newitem 2009-07-22 15:34:56 -03:00
Ismael Carnales
3c4afb23be moved newitem from scrapy.contrib_exp to scrapy.newitem 2009-07-22 15:13:36 -03:00
Ismael Carnales
eb3a1dc16c return default values for newitem in __getitem__ 2009-07-22 15:13:33 -03:00
Ismael Carnales
202894dd8f fixes to newitem doc 2009-07-22 10:25:22 -03:00
Pablo Hoffman
7d8ba0542c fixed typo in doc 2009-07-21 17:38:46 -03:00
Pablo Hoffman
aa345e116a Added spider middleware documentation 2009-07-21 17:19:19 -03:00
Ismael Carnales
90f1d9e489 Added Scheduler middleware reference documentation 2009-07-21 16:52:27 -03:00
Ismael Carnales
7afc915717 added BaseItem as base item class, and moved extra functionality of ScrapedItem to RobustScrapedItem 2009-07-21 16:11:21 -03:00
Ismael Carnales
00d9ce6608 fixes to newitem doc 2009-07-21 12:22:49 -03:00
Ismael Carnales
894a5d81d3 changed the newitem API to a dict-like interface 2009-07-21 12:20:49 -03:00
Pablo Hoffman
6fcbd03bff removed obsolete code 2009-07-21 11:48:31 -03:00
Pablo Hoffman
4a10e1dabd some cleanup to genspider command, disabled log and fixed inconsistencies reported in #55 2009-07-21 11:31:14 -03:00
Pablo Hoffman
29c92fe818 applied patch contributed by Manuel Aristaran for adding non-standard port support to mysql uri parsing in mysql_connect. closes #82 2009-07-21 01:42:44 -03:00
Pablo Hoffman
9867e7b2d6 removed references to obsolete ENABLED_SPIDERS_FILE setting 2009-07-21 01:11:48 -03:00
Pablo Hoffman
b9e3f72bee Fixed encoding bug in tricky Response cloning case (reported in #90) and added unittests. 2009-07-20 11:44:12 -03:00
Ismael Carnales
cc33ac4f86 moved item.fields to item._fields in newitem 2009-07-16 13:00:25 -03:00
Pablo Hoffman
92b746d866 SimpledbStatsCollector: moved domain creation to constructor 2009-07-17 12:49:53 -03:00
Pablo Hoffman
97a854a53a some cleanup to googledir example project 2009-07-17 08:57:45 -03:00
Pablo Hoffman
292757f312 added link to architecture overview and fixed old link 2009-07-16 19:15:19 -03:00
Pablo Hoffman
136014d718 Added section about relative xpaths to XPathSelectors doc 2009-07-16 17:29:29 -03:00
Pablo Hoffman
0fd603e27e Automated merge with http://hg.scrapy.org/users/ismael/scrapy-newitem/ 2009-07-16 11:16:46 -03:00
Pablo Hoffman
c2f12327d6 added dumping of global scrapy stats at engine shutdown 2009-07-16 10:48:10 -03:00
Ismael Carnales
a9069e8eaa reorganized experimental doc in topics and ref, split long documents, introduced minor changes 2009-07-16 09:58:44 -03:00
Pablo Hoffman
a4a1c4b130 fixed and improved formatting of StatsMailer extension 2009-07-16 09:46:30 -03:00
Daniel Grana
9006d1f040 remove already deleted webconsole extension from default settings 2009-07-16 01:34:51 -03:00
Pablo Hoffman
7deb9d9b0b removed unused scrapy.utils.datatypes.Sitemap class 2009-07-15 22:18:19 -03:00
Pablo Hoffman
6648b5da8b updated logging of report in memdebug extension 2009-07-15 22:15:11 -03:00
Pablo Hoffman
1125825996 doc: minor updates to tutorial 2009-07-15 22:10:00 -03:00
Pablo Hoffman
db6afcf3cb updated scrapy command line help 2009-07-15 22:09:26 -03:00
Pablo Hoffman
5faf725c53 updated parse command help 2009-07-15 22:05:20 -03:00
Pablo Hoffman
9a800b5aba removed obsolete log command 2009-07-15 21:55:39 -03:00
Pablo Hoffman
fc2c6f99b4 updated help of some commands 2009-07-15 21:34:56 -03:00
Pablo Hoffman
fd4aed7d7b renamed download command to fetch
--HG--
rename : scrapy/command/commands/download.py => scrapy/command/commands/fetch.py
2009-07-15 21:34:12 -03:00
Pablo Hoffman
a24516ed19 updated list command to show only spider domain names 2009-07-15 21:18:58 -03:00
Pablo Hoffman
5582305648 renamed old STATS_DEBUG setting to STATS_DUMP 2009-07-15 21:02:51 -03:00
Ismael Carnales
996fdb7edf applied tn patch: check for project_name in scrapy-admin to be a valid module name, closes #92 2009-07-15 09:31:16 -03:00