Pablo Hoffman
7fb889349e
fixed link to experimental doc
2009-07-14 08:46:15 -03:00
Pablo Hoffman
f1dc4e4af2
corrected TextField examples
2009-07-14 08:39:38 -03:00
Daniel Grana
bffcd45361
Automated merge with ssh://hg.scrapy.org/scrapy
2009-07-14 02:29:59 -03:00
Daniel Grana
95a0b1c614
split next_request method to allow calling it even when backout is needed
2009-07-14 02:29:43 -03:00
Pablo Hoffman
ae82f8c70d
made BaseField.to_python() raise NotImplementedError (already documented) and adapted unittest
2009-07-13 22:19:25 -03:00
Pablo Hoffman
a8aae41a13
removed unused files
2009-07-13 22:16:43 -03:00
Pablo Hoffman
5c39d173a5
merged proposed and experimental documentation, as it didn't make sense to keep two separate sections
--HG--
rename : docs/proposed/_images/scrapy_architecture.odg => docs/experimental/_images/scrapy_architecture.odg
rename : docs/proposed/_images/scrapy_architecture.png => docs/experimental/_images/scrapy_architecture.png
rename : docs/proposed/index.rst => docs/experimental/index.rst
rename : docs/proposed/newitem-fields.rst => docs/experimental/newitem-fields.rst
rename : docs/proposed/newitem.rst => docs/experimental/newitem.rst
2009-07-13 22:15:54 -03:00
Pablo Hoffman
26bb8ef608
doc: improved newitem fields reference
2009-07-13 22:05:48 -03:00
Pablo Hoffman
2bf39b7cdb
minor layout cleanups to newitem doc
2009-07-13 22:05:18 -03:00
Pablo Hoffman
74fcfc2cfc
deprecated old adaptors documentation
2009-07-13 22:03:56 -03:00
Pablo Hoffman
c73ff8198b
newitem fields: dropped support in to_python() for converting from None for default value, improved raising of TypeError instead of ValueError when appropriate, added and adapted unittests
2009-07-13 21:10:29 -03:00
Ismael Carnales
72457c3e4e
better handling of default value in newitem
2009-07-13 17:03:38 -03:00
Ismael Carnales
47d937f36b
only accept unicode strings in text fields
2009-07-13 15:54:48 -03:00
Ismael Carnales
d75afaa161
renamed StringField to TextField
2009-07-13 15:54:46 -03:00
Pablo Hoffman
8634a0d181
more efficient Item implementation and added support for using custom methods (unittests included)
2009-07-13 14:00:41 -03:00
Pablo Hoffman
dff510384b
doc: updated SCHEDULER_MIDDLEWARES_BASE setting
2009-07-13 13:33:47 -03:00
Ismael Carnales
b44409a203
added TimeField to newitem
2009-07-13 10:31:32 -03:00
Pablo Hoffman
5eebe1f405
fixed bug in fetcher caused by recent spider manager changes (thanks andres)
2009-07-13 00:04:00 -03:00
Pablo Hoffman
e3fe0ef297
Some changes to newitem API and implementation:
- Dropped support for wildcard importing from newitem package (you must now
import from newitem.fields, without using a wildcard)
- Removed assign() method from Fields as it was apparently redundant (with
to_python() method) and I couldn't find any reason for keeping it (neither in
the docs nor in the tests)
- Moved deiter() method of Field to StringField, as both its purpose and
implementation were specific to strings. If it's really needed as a general
purpose method, it could be restored. Also, no unittest was broken because of
this change, which sort-of reinforces my point.
- Renamed (previously mentioned) StringField.deiter() method to
StringField.to_single(), for better consistency with to_python() method
- Removed Field class as it was useless without the deiter() functionality (now
belonging to StringField class)
- Moved ansi_date_re module variable to DateField class attribute
- Simplified implementation of DecimalField, FloatField and IntegerField to one
line of code (using tests to make sure not to break any functionality)
- Renamed ItemMeta class (in models.py) to _ItemMeta to highlight its protected
state (should not be externally imported)
- Added support for instantiating new items with dicts, to support
deserializing items with their repr() string
- Added unittests for new functionality introduced
2009-07-11 22:19:56 -03:00
Pablo Hoffman
5054b67a02
improved newitems doc and marked robust scraped items as deprecated
2009-07-11 21:26:52 -03:00
Pablo Hoffman
1a153d47f3
improved invalid xpath exception message in xpath selectors, and added unittests
2009-07-11 17:19:20 -03:00
Pablo Hoffman
fc64360a34
removed unused lines
2009-07-11 16:42:32 -03:00
Pablo Hoffman
e3caf00d7a
simplified implementation of spider manager by removing knowledge of enabled spiders
2009-07-10 16:41:02 -03:00
dgrana
4cd1fa9c32
generate dropin.cache for spiders under tests
2009-07-10 05:29:27 +01:00
Pablo Hoffman
9270810840
improved usage of urljoin_rfc function, adding unittests and encoding where needed
2009-07-09 18:45:40 -03:00
Daniel Grana
d5d2c5c924
update documentation to recent pydispatcher import path change
2009-07-09 17:13:30 -03:00
Daniel Grana
18fbd7c7eb
Automated merge with ssh://hg.scrapy.org/scrapy
2009-07-09 16:58:07 -03:00
Daniel Grana
eff8ea6173
remove response from item_passed and item_dropped signal api
2009-07-09 16:57:03 -03:00
Pablo Hoffman
5da32d9f6d
fixed Sphinx warning
2009-07-09 16:50:13 -03:00
Pablo Hoffman
ae7333d598
added simplejson optional dependency to doc
2009-07-09 16:49:20 -03:00
Daniel Grana
aba16c20c4
Automated merge with ssh://hg.scrapy.org/scrapy
2009-07-09 14:38:56 -03:00
Daniel Grana
a8de5cef6e
remove xlib hack that appends scrapy/xlib to sys.path
2009-07-09 14:37:59 -03:00
Ismael Carnales
32c25f5a36
complete the newitem tests
2009-07-09 13:03:54 -03:00
Ismael Carnales
25b53df191
merge with trunk
2009-07-09 13:02:49 -03:00
Pablo Hoffman
60e7b80798
removed signal docs from core.signals module, to leave them only in one place (the doc)
2009-07-09 12:57:10 -03:00
Ismael Carnales
f31f75c0e2
remove required attribute from newitem (until we add a validation framework)
2009-07-09 12:54:02 -03:00
Ismael Carnales
9e3e41f946
added more newitem documentation in proposed
2009-07-09 11:29:04 -03:00
Pablo Hoffman
b071681cd4
removed duplicated spiders doc (which used autodoc)
2009-07-09 11:14:33 -03:00
Pablo Hoffman
4f19115a80
removed old setting from default_settings.py, updated doc of CONCURRENT_ITEMS setting
2009-07-09 10:56:15 -03:00
Pablo Hoffman
a4b728f2b2
Scraper: added lower limit for responses sizes, removed redundant line
2009-07-09 10:55:30 -03:00
Pablo Hoffman
8b26e49636
Added new ItemProcessor component to Scraper component
2009-07-08 23:48:06 -03:00
Pablo Hoffman
42b86a385f
removed wtf line
2009-07-08 18:19:54 -03:00
pablo
5cbafaea7f
StackTraceDump extension: using USR2 signal to avoid collision with other stuff that uses USR1 (such as twistd log rotation)
2009-07-08 09:19:35 -03:00
Daniel Grana
b83851dcc3
remove unused lines from shell command
2009-07-07 16:24:59 -03:00
Daniel Grana
8e5ede7179
shell command was broken by recent commits because scrapyengine.crawl no longer returns a deferred; now we use scrapyengine.schedule, which returns the deferred of the download response
2009-07-07 16:22:23 -03:00
damian
1ba98606c2
test.test_utils_url: update parameter name; utils.url: minor code clean up
2009-07-07 12:35:24 -03:00
damian
460f690c5c
utils.url: add_or_replace_parameter function fixed, quoted urls support and test cases added
2009-07-07 11:20:26 -03:00
pablo
c205f7d8e5
added missing comment for non-trivial code
2009-07-06 20:38:39 -03:00
Daniel Grana
a15dc94340
images: images uploaded through amazon s3 special spider must be scheduled
2009-07-06 16:16:49 -03:00
Daniel Grana
2e52005847
rewrite RequestLimitMiddleware spidermw so it does not consume spider output at once
--HG--
rename : scrapy/contrib/spidermiddleware/limit.py => scrapy/contrib/spidermiddleware/requestlimit.py
2009-07-06 15:35:36 -03:00