mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 07:03:49 +00:00

1389 Commits

Author SHA1 Message Date
Pablo Hoffman
7fb889349e fixed link to experimental doc 2009-07-14 08:46:15 -03:00
Pablo Hoffman
f1dc4e4af2 corrected TextField examples 2009-07-14 08:39:38 -03:00
Daniel Grana
bffcd45361 Automated merge with ssh://hg.scrapy.org/scrapy 2009-07-14 02:29:59 -03:00
Daniel Grana
95a0b1c614 split next_request method to allow calling it even when backout is needed 2009-07-14 02:29:43 -03:00
Pablo Hoffman
ae82f8c70d made BaseField.to_python() raise NotImplementedError (already documented) and adapted unittest 2009-07-13 22:19:25 -03:00
Pablo Hoffman
a8aae41a13 removed unused files 2009-07-13 22:16:43 -03:00
Pablo Hoffman
5c39d173a5 merged proposed and experimental documentation, as it didn't make sense to keep two separate sections
--HG--
rename : docs/proposed/_images/scrapy_architecture.odg => docs/experimental/_images/scrapy_architecture.odg
rename : docs/proposed/_images/scrapy_architecture.png => docs/experimental/_images/scrapy_architecture.png
rename : docs/proposed/index.rst => docs/experimental/index.rst
rename : docs/proposed/newitem-fields.rst => docs/experimental/newitem-fields.rst
rename : docs/proposed/newitem.rst => docs/experimental/newitem.rst
2009-07-13 22:15:54 -03:00
Pablo Hoffman
26bb8ef608 doc: improved newitem fields reference 2009-07-13 22:05:48 -03:00
Pablo Hoffman
2bf39b7cdb minor layout cleanups to newitem doc 2009-07-13 22:05:18 -03:00
Pablo Hoffman
74fcfc2cfc deprecated old adaptors documentation 2009-07-13 22:03:56 -03:00
Pablo Hoffman
c73ff8198b newitem fields: dropped support in to_python() for converting from None for default value, improved raising of TypeError instead of ValueError when appropriate, added and adapted unittests 2009-07-13 21:10:29 -03:00
Ismael Carnales
72457c3e4e better handling of default value in newitem 2009-07-13 17:03:38 -03:00
Ismael Carnales
47d937f36b only accept unicode strings in text fields 2009-07-13 15:54:48 -03:00
Ismael Carnales
d75afaa161 renamed StringField to TextField 2009-07-13 15:54:46 -03:00
Pablo Hoffman
8634a0d181 more efficient Item implementation and added support for using custom methods (unittests included) 2009-07-13 14:00:41 -03:00
Pablo Hoffman
dff510384b doc: updated SCHEDULER_MIDDLEWARES_BASE setting 2009-07-13 13:33:47 -03:00
Ismael Carnales
b44409a203 added TimeField to newitem 2009-07-13 10:31:32 -03:00
Pablo Hoffman
5eebe1f405 fixed bug in fetcher caused by recent spider manager changes (thanks andres) 2009-07-13 00:04:00 -03:00
Pablo Hoffman
e3fe0ef297 Some changes to newitem API and implementation:
- Dropped support for wildcard importing from newitem package (must now import
  from newitem.fields and don't use wildcard)
- Removed assign() method from Fields as it was apparently redundant (with
  to_python() method) and I couldn't find any reason for keeping it (neither in
  the docs nor in the tests)
- Moved deiter() method of Field to StringField, as both its purpose and
  implementation were specific to strings. If it's really needed as a general
  purpose method, it could be restored. Also, no unittest was broken because of
  this change, which sort-of reinforces my point.
- Renamed (previously mentioned) StringField.deiter() method to
  StringField.to_single(), for better consistency with to_python() method
- Removed Field class as it was useless without the deiter() functionality (now
  belonging to StringField class)
- Moved ansi_date_re module variable to DateField class attribute
- Simplified implementation of DecimalField, FloatField and IntegerField to one
  line of code (using tests to make sure not to break any functionality)
- Renamed ItemMeta class (in models.py) to _ItemMeta to highlight its protected
  state (should not be externally imported)
- Added support for instantiating new items with dicts, to support
  deserializing items with their repr() string
- Added unittests for new functionality introduced
2009-07-11 22:19:56 -03:00
Pablo Hoffman
5054b67a02 improved newitems doc and marked robust scraped items as deprecated 2009-07-11 21:26:52 -03:00
Pablo Hoffman
1a153d47f3 improved invalid xpath exception message in xpath selectors, and added unittests 2009-07-11 17:19:20 -03:00
Pablo Hoffman
fc64360a34 removed unused lines 2009-07-11 16:42:32 -03:00
Pablo Hoffman
e3caf00d7a simplified implementation of spider manager by removing knowledge of enabled spiders 2009-07-10 16:41:02 -03:00
dgrana
4cd1fa9c32 generate dropin.cache for spiders under tests 2009-07-10 05:29:27 +01:00
Pablo Hoffman
9270810840 improved usage of urljoin_rfc function, adding unittests and encoding where needed 2009-07-09 18:45:40 -03:00
Daniel Grana
d5d2c5c924 update documentation to recent pydispatcher import path change 2009-07-09 17:13:30 -03:00
Daniel Grana
18fbd7c7eb Automated merge with ssh://hg.scrapy.org/scrapy 2009-07-09 16:58:07 -03:00
Daniel Grana
eff8ea6173 remove response from item_passed and item_dropped signal api 2009-07-09 16:57:03 -03:00
Pablo Hoffman
5da32d9f6d fixed Sphinx warning 2009-07-09 16:50:13 -03:00
Pablo Hoffman
ae7333d598 added simplejson optional dependency to doc 2009-07-09 16:49:20 -03:00
Daniel Grana
aba16c20c4 Automated merge with ssh://hg.scrapy.org/scrapy 2009-07-09 14:38:56 -03:00
Daniel Grana
a8de5cef6e remove xlib hack that appends scrapy/xlib to sys.path 2009-07-09 14:37:59 -03:00
Ismael Carnales
32c25f5a36 complete the newitem tests 2009-07-09 13:03:54 -03:00
Ismael Carnales
25b53df191 merge with trunk 2009-07-09 13:02:49 -03:00
Pablo Hoffman
60e7b80798 removed signal docs from core.signals module, to leave them only in one place (the doc) 2009-07-09 12:57:10 -03:00
Ismael Carnales
f31f75c0e2 remove required attribute from newitem (until we add a validation framework) 2009-07-09 12:54:02 -03:00
Ismael Carnales
9e3e41f946 added more newitem documentation in proposed 2009-07-09 11:29:04 -03:00
Pablo Hoffman
b071681cd4 removed duplicated spiders doc (which used autodoc) 2009-07-09 11:14:33 -03:00
Pablo Hoffman
4f19115a80 removed old setting from default_settings.py, updated doc of CONCURRENT_ITEMS setting 2009-07-09 10:56:15 -03:00
Pablo Hoffman
a4b728f2b2 Scraper: added lower limit for responses sizes, removed redundant line 2009-07-09 10:55:30 -03:00
Pablo Hoffman
8b26e49636 Added new ItemProcessor component to Scraper component 2009-07-08 23:48:06 -03:00
Pablo Hoffman
42b86a385f removed wtf line 2009-07-08 18:19:54 -03:00
pablo
5cbafaea7f StackTraceDump extension: using USR2 signal to avoid collision with other stuff that uses USR1 (such as twistd log rotation) 2009-07-08 09:19:35 -03:00
Daniel Grana
b83851dcc3 remove unused lines from shell command 2009-07-07 16:24:59 -03:00
Daniel Grana
8e5ede7179 shell command was broken by recent commits because scrapyengine.crawl does not return a deferred anymore; now we use scrapyengine.schedule, which returns the deferred of the download response 2009-07-07 16:22:23 -03:00
damian
1ba98606c2 test.test_utils_url: update parameter name; utils.url: minor code clean up 2009-07-07 12:35:24 -03:00
damian
460f690c5c utils.url: add_or_replace_parameter function fixed, quoted urls support and test cases added 2009-07-07 11:20:26 -03:00
pablo
c205f7d8e5 added missing comment for non-trivial code 2009-07-06 20:38:39 -03:00
Daniel Grana
a15dc94340 images: images uploaded through amazon s3 special spider must be scheduled 2009-07-06 16:16:49 -03:00
Daniel Grana
2e52005847 rewrite RequestLimitMiddleware spidermw so it does not consume spider output at once
--HG--
rename : scrapy/contrib/spidermiddleware/limit.py => scrapy/contrib/spidermiddleware/requestlimit.py
2009-07-06 15:35:36 -03:00