elpolilla
e5a764cd2a
. Modified LinkExtractors extract_links for being inconsistent. Moved extra parameters to the constructors.
...
. Renamed ImageLinkExtractor to HtmlLinkExtractor and moved it to scrapy.contrib.link_extractors.
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40599
2009-01-02 14:06:51 +00:00
elpolilla
2979521b63
Reverted change in r590 for inconsistency in the addition operation (doesn't work equally in both ways)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40598
2009-01-02 13:07:18 +00:00
samus_
ebd5f465aa
this changeset improves the extractors' implementation:
...
* moved LinkExtractor.extract_links to a private method and created wrapper in order to be able to work with text directly
* removed fugly new_response_from_xpaths from scrapy.utils.response and replaced it with a better internal algorithm
* moved former _normalize_input from scrapy.utils.iterators to scrapy.utils.response to fill the hole
* turned extractors' output lists into generators; this is safe because the result is always used in for..in constructs
* adapted test for generators (test should be rewritten anyway)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40597
2009-01-02 08:42:36 +00:00
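
Note on the list-to-generator change in ebd5f465aa: the sketch below is only an illustration of the trade-off, not the code from that changeset. The names `extract_links_list`, `extract_links_iter` and the `HREF_RE` pattern are hypothetical, and a regex stands in for the real extraction logic.

```python
import re

# Hypothetical, simplified pattern; real link extraction is more involved.
HREF_RE = re.compile(r'href="([^"]+)"')


def extract_links_list(html):
    """Eager variant: builds the full list before returning."""
    return HREF_RE.findall(html)


def extract_links_iter(html):
    """Lazy variant: yields one link at a time as the caller iterates."""
    for match in HREF_RE.finditer(html):
        yield match.group(1)


html = '<a href="/a">A</a> <a href="/b">B</a>'

# Both variants behave the same inside a for..in loop.
for link in extract_links_iter(html):
    print(link)
```

The lazy variant is only a drop-in replacement while callers iterate once; code that calls `len()` on the result, indexes it, or loops over it twice would need an explicit `list(...)` first.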
elpolilla
c9e48dc5ed
Disabled ImageLinkExtractor test for triggering mysterious leaks in libxml2
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40596
2009-01-02 04:12:59 +00:00
elpolilla
e0cb3c1ec4
Added test's missing sample file
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40595
2009-01-02 02:36:27 +00:00
elpolilla
91a23e61bf
Renamed LinkExtractors extract_urls method to extract_links
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40594
2009-01-02 02:34:44 +00:00
elpolilla
c82c799d07
Added ImageLinkExtractor's missing docstring
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40593
2009-01-02 02:25:17 +00:00
elpolilla
5757e54a28
- Added repr method to Link objects
...
- Added ImageLinkExtractor and tests
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40592
2009-01-02 02:18:11 +00:00
elpolilla
7bd73e0c09
Modified XPathSelectorLists: adding them should return a new XPathSelectorList
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40591
2008-12-31 13:23:25 +00:00
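
Note on 7bd73e0c09: a minimal sketch of a list subclass whose addition returns the subclass rather than a plain list. `SelectorList` is a hypothetical stand-in, not the actual XPathSelectorList code. It also shows one asymmetry this kind of override can have, which may or may not be the inconsistency that led to the later revert.

```python
class SelectorList(list):
    """Hypothetical stand-in for a list-like container of selectors."""

    def __add__(self, other):
        # list.__add__ returns a plain list, so wrap the result
        # back into the subclass.
        return self.__class__(list.__add__(self, other))


left = SelectorList([1]) + [2]   # handled by SelectorList.__add__
right = [1] + SelectorList([2])  # handled by plain list concatenation

print(type(left).__name__)
print(type(right).__name__)
```

With only `__add__` overridden, the result type depends on which operand is on the left. Defining `__radd__` as well would be one way to make both orders return the subclass.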
olveyra
99143ff357
doc string fix
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40590
2008-12-30 19:38:15 +00:00
Ismael Carnales
7e4c9200c2
corrected code blocks
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40589
2008-12-30 15:04:59 +00:00
olveyra
1433af5561
fix to commit (r586)
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40588
2008-12-30 14:51:37 +00:00
olveyra
710247e096
Allow loading a default spider when no spider is found for a given url. Also a generic spider
...
was added in contrib/spiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40587
2008-12-30 13:49:00 +00:00
Pablo Hoffman
d047409579
rearranged and sorted out default scrapy settings. closes #20
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40586
2008-12-30 13:34:57 +00:00
Pablo Hoffman
1f247b1efc
added settings documentation topic, and completed available settings reference. closes #30
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40585
2008-12-30 13:28:36 +00:00
olveyra
af79bedf8f
Added doc line in RegexLinkExtractor for the case when no allow/deny argument is given
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40584
2008-12-30 13:04:06 +00:00
Ismael Carnales
8ef8766070
the badge is back to green ... hulk is angry
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40583
2008-12-30 11:27:19 +00:00
Ismael Carnales
ad2c2368a7
using small gray badge
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40582
2008-12-30 11:25:42 +00:00
Ismael Carnales
c149e2de2c
removed menu from footer, added django badge
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40581
2008-12-30 11:16:32 +00:00
Ismael Carnales
c9491b6041
replicating header menu in footer
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40580
2008-12-30 10:56:15 +00:00
Ismael Carnales
81fecc0067
fixed settings indentation and added reference
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40579
2008-12-30 10:45:19 +00:00
Ismael Carnales
36277f3a32
changed settings from description unit to crossreference
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40578
2008-12-29 19:02:56 +00:00
Ismael Carnales
5a038e7d27
added docs breadcrumb
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40577
2008-12-29 18:46:43 +00:00
Ismael Carnales
e36f5624a7
fixed caps in news, reordered menu items
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40576
2008-12-29 18:35:13 +00:00
Ismael Carnales
d619a11f88
make sidebar bigger
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40575
2008-12-29 18:28:40 +00:00
Ismael Carnales
2f2c5cf19d
removed django.contrib.comments requirement
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40574
2008-12-29 18:27:16 +00:00
Ismael Carnales
3d85932ae0
moved blog to news
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40573
2008-12-29 18:20:58 +00:00
Ismael Carnales
48682189b4
using a modified version of django simple blog
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40572
2008-12-29 16:50:49 +00:00
Ismael Carnales
a5b609ac5f
moved blog templates to backup folder
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40571
2008-12-29 16:13:19 +00:00
elpolilla
25a38e53d5
Fixed minor encoding issues in adaptors
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40570
2008-12-29 15:54:16 +00:00
elpolilla
bbd2a9d6f0
Fixed typo in test
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40569
2008-12-29 15:17:34 +00:00
elpolilla
5eb7e5ba89
Fixed lots of encoding issues, and improved some adaptors tests
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40568
2008-12-29 15:10:21 +00:00
Pablo Hoffman
fc3f66bd1c
fixed grammar error
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40567
2008-12-29 12:10:01 +00:00
Pablo Hoffman
b5a34dd21d
started writing settings documentation
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40566
2008-12-29 11:38:34 +00:00
Pablo Hoffman
a593af5f53
fixed cyclic import between scrapy.core.engine and scrapy.utils.db
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40565
2008-12-28 08:45:27 +00:00
Pablo Hoffman
8a885e4dc8
moved scrapy-docs svn:external where it belongs
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40564
2008-12-27 21:38:27 +00:00
Pablo Hoffman
06ad08c681
moved dia diagram to docs/media
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40563
2008-12-27 21:32:18 +00:00
Pablo Hoffman
7e5b85ca9e
moved scrapy docs from website source to scrapy source, since it makes more sense there
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40562
2008-12-27 21:27:22 +00:00
Pablo Hoffman
97d4754200
moved docs to docs-old
...
--HG--
rename : scrapy/trunk/docs/scrapy-architecture.dia => scrapy/trunk/docs-old/scrapy-architecture.dia
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40561
2008-12-27 21:20:29 +00:00
Pablo Hoffman
913c34a828
added AUTHORS file
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40560
2008-12-27 19:57:10 +00:00
Pablo Hoffman
77a99999f8
improved text
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40559
2008-12-27 19:16:12 +00:00
Pablo Hoffman
a617c2cb36
some updates to home and download page
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40558
2008-12-27 19:15:13 +00:00
Pablo Hoffman
ed5e25c913
removed tagline from scrapy logo
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40557
2008-12-27 18:21:37 +00:00
Pablo Hoffman
f932a3d842
added spider_exceptions to scrapy stats
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40556
2008-12-27 17:37:05 +00:00
Pablo Hoffman
6fa33f4463
fixed minor bug in scrapy manager
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40555
2008-12-27 02:06:05 +00:00
Pablo Hoffman
e89ad0be8c
fixed minor bug in scrapy manager
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40554
2008-12-27 02:05:05 +00:00
Pablo Hoffman
9557e24203
added start_requests method to BaseSpider, made start_urls empty by default instead of a required attribute in every spider
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40553
2008-12-27 01:07:29 +00:00
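
Note on 9557e24203: a rough sketch of the pattern the message describes, assuming the default `start_requests` simply turns each entry of `start_urls` into a request. The `Request` class here is a hypothetical placeholder, and the real BaseSpider may differ in its details.

```python
class Request:
    """Hypothetical placeholder for a request object."""

    def __init__(self, url):
        self.url = url


class BaseSpider:
    # Empty by default, so subclasses are no longer required to define it.
    start_urls = ()

    def start_requests(self):
        # Assumed default behaviour: one request per start URL.
        return [Request(url) for url in self.start_urls]


class ListingSpider(BaseSpider):
    start_urls = ("http://example.com/page1", "http://example.com/page2")


class CustomSpider(BaseSpider):
    def start_requests(self):
        # A spider can build its initial requests itself
        # instead of listing URLs.
        return [Request("http://example.com/search?q=%d" % n) for n in range(3)]
```

A spider with neither `start_urls` nor an overridden `start_requests` simply produces no initial requests under this sketch.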
elpolilla
5b159d0f04
- Improved unquote_markup by using generators instead of lists
...
- Added possibility of specifying headers in items_to_csv
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40552
2008-12-26 18:53:15 +00:00
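
Note on 5b159d0f04: the log does not show the signature of `items_to_csv`, so the version below is a guess at the general shape, using dict items and the standard `csv` module. The `fields` and `headers` parameters are assumptions made for illustration.

```python
import csv
import io


def items_to_csv(items, fields, headers=None):
    """Serialize dict-like items to CSV text.

    `fields` selects which keys to write and in what order;
    `headers`, if given, is written as the first row.
    Both parameter names are hypothetical.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    if headers is not None:
        writer.writerow(headers)
    for item in items:
        # Missing keys become empty cells.
        writer.writerow([item.get(field, "") for field in fields])
    return buf.getvalue()


rows = [{"name": "book", "price": 10}, {"name": "pen"}]
print(items_to_csv(rows, ["name", "price"], headers=["Name", "Price"]))
```

Keeping `headers` optional lets callers append rows to an existing file without repeating the header line.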
elpolilla
f59f1c8bc0
Added items_to_csv function
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40551
2008-12-26 16:10:03 +00:00
elpolilla
c7332cd372
Updated scrapy tutorial
...
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40550
2008-12-26 14:03:45 +00:00