mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 00:04:25 +00:00

787 Commits

Author SHA1 Message Date
elpolilla
ea439e4b03 Updated Selectors documentation
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40637
2009-01-04 01:15:08 +00:00
elpolilla
a5025899ab Updated Spiders documentation
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40636
2009-01-04 00:36:16 +00:00
Pablo Hoffman
f8f0db8bd3 doc: several more improvements
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40635
2009-01-03 09:14:52 +00:00
Pablo Hoffman
137429e518 doc: added README
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40634
2009-01-03 07:54:33 +00:00
Pablo Hoffman
16f9f5a9ef doc: added topic about robots.txt, added ROBOTSTXT_OBEY setting, added missing REQUESTS_PER_DOMAIN setting
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40633
2009-01-03 07:41:43 +00:00
Pablo Hoffman
b59acb18ff minor typo and grammar corrections to docs/faq
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40632
2009-01-03 07:40:40 +00:00
Pablo Hoffman
4a3ba6957b updated default_settings with robots.txt related settings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40631
2009-01-03 07:36:35 +00:00
Pablo Hoffman
e437ff67ae finished working version of robots.txt downloader middleware, and renamed the module/class
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40630
2009-01-03 07:35:30 +00:00
Pablo Hoffman
8a404c4480 removed deprecated Request.__getitem__ method
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40629
2009-01-03 06:15:30 +00:00
Pablo Hoffman
8d1d5102be updated downloader-middleware to link to Request and Response classes
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40628
2009-01-03 03:11:08 +00:00
Pablo Hoffman
8b10d0183a added (incomplete) request-response doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40627
2009-01-03 03:10:14 +00:00
Pablo Hoffman
44f6358ace renamed itempipeline.rst to item-pipeline.rst
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40626
2009-01-03 03:05:19 +00:00
Pablo Hoffman
1a9845b3e8 added documentation for downloader middleware. closes #27
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40625
2009-01-03 01:29:09 +00:00
Pablo Hoffman
093c7e121b docs: more updates to Makefile and conf.py
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40624
2009-01-03 01:25:03 +00:00
Pablo Hoffman
14697b5fba added FAQ to docs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40623
2009-01-03 01:24:18 +00:00
Pablo Hoffman
d06fc99101 more stripping down of doc Makefile
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40622
2009-01-03 00:58:09 +00:00
Pablo Hoffman
144eaa64b4 added stripped down version of Python documentation Makefile
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40621
2009-01-03 00:43:04 +00:00
Pablo Hoffman
541451afb3 simplified obfuscated implementation of duplicated links removal and made it use scrapy.utils.python.unique function. also added missing 'unique' argument to docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40620
2009-01-02 20:45:34 +00:00
Pablo Hoffman
c69d94fb45 removed unused private variable
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40619
2009-01-02 20:44:07 +00:00
Pablo Hoffman
f8a6e2fc7c added key argument to unique function
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40618
2009-01-02 20:42:20 +00:00
Pablo Hoffman
e1c738a49e minor text changes to home page
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40617
2009-01-02 20:17:40 +00:00
Pablo Hoffman
aa92755344 changed scrapy.org title to include more key words
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40616
2009-01-02 20:07:24 +00:00
Pablo Hoffman
371890008b removed ugly apostrophe
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40615
2009-01-02 20:06:17 +00:00
Pablo Hoffman
fc36ad002f renamed ItemPipeline class to ItemPipelineManager to avoid confusions
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40614
2009-01-02 19:59:41 +00:00
Pablo Hoffman
dd2f92f040 added comment about new docs
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40613
2009-01-02 19:55:22 +00:00
Pablo Hoffman
e95a28ef3d added documentation for item pipeline, signals and exceptions. changed scrapy version to 0.7 in sphinx conf. closes #26, #40, #41
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40612
2009-01-02 19:48:31 +00:00
Pablo Hoffman
3789f064ff added -E to sphinx-build so source rst's are always reloaded and warning messages printed
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40611
2009-01-02 19:45:51 +00:00
elpolilla
47881c6e64 . Modified unicode conversion functions to accept the conversion encoding as a parameter
. Modified safe_url_string to make use of this change

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40610
2009-01-02 18:59:34 +00:00
elpolilla
63662ba6ff . Added tests for RegexLinkExtractor
. Modified LinkExtractor in order to percent-encode urls using the response encoding
. Improved LinkExtractors processing of unique links

--HG--
rename : scrapy/trunk/scrapy/tests/sample_data/image_linkextractor.html => scrapy/trunk/scrapy/tests/sample_data/link_extractor/image_linkextractor.html
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40609
2009-01-02 18:32:55 +00:00
elpolilla
c791155aac Modified url_safe_string in order to accept strings encoded in any encoding apart from unicode objects
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40608
2009-01-02 18:31:55 +00:00
Pablo Hoffman
95495f1d9d removed domain_initialized signal which was defined but never used
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40607
2009-01-02 18:24:02 +00:00
Pablo Hoffman
12b56a3e39 removed border-bottom on link hover which made trac behave strangely
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40606
2009-01-02 17:15:22 +00:00
Pablo Hoffman
f7e4b91fe7 more rearrangements to scrapy doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40605
2009-01-02 16:34:10 +00:00
samus_
6a4a4a33ca created test for body_or_str (forgot to add the file to the repo at r601)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40604
2009-01-02 16:28:05 +00:00
Pablo Hoffman
a0f2ce0b53 rearranged doc according to what we agreed on a meeting with elpolilla and ismael
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40603
2009-01-02 16:25:28 +00:00
samus_
d706182e97 created test for body_or_str and made small improvement to scrapy.http.response
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40602
2009-01-02 16:21:28 +00:00
Pablo Hoffman
808923f5fd added topics section to docs, changed reference to ref
--HG--
rename : scrapy/trunk/docs/reference/index.rst => scrapy/trunk/docs/ref/index.rst
rename : scrapy/trunk/docs/reference/settings.rst => scrapy/trunk/docs/ref/settings.rst
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40601
2009-01-02 16:08:18 +00:00
samus_
c1edf0e381 reverted generator approach because it conflicts with unique parameter plus fixed bug in canonicalize routine
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40600
2009-01-02 14:57:33 +00:00
elpolilla
e5a764cd2a . Modified LinkExtractors' extract_links because it was inconsistent. Moved extra parameters to the constructors.
. Renamed ImageLinkExtractor to HtmlLinkExtractor and moved it to scrapy.contrib.link_extractors.

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40599
2009-01-02 14:06:51 +00:00
elpolilla
2979521b63 Reverted change in r590 for inconsistency in the addition operation (doesn't work equally in both ways)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40598
2009-01-02 13:07:18 +00:00
samus_
ebd5f465aa this changeset improves the extractors' implementation:
* moved LinkExtractor.extract_links to a private method and created wrapper in order to be able to work with text directly
* removed fugly new_response_from_xpaths from scrapy.utils.response and replaced it with a better internal algorithm
* moved former _normalize_input from scrapy.utils.iterators to scrapy.utils.response to fill the hole
* turned extractors' output lists into generators; this is safe because the result is always used in for..in constructs
* adapted test for generators (test should be rewritten anyway)

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40597
2009-01-02 08:42:36 +00:00
elpolilla
c9e48dc5ed Disabled ImageLinkExtractor test because it triggers mysterious leaks in libxml2
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40596
2009-01-02 04:12:59 +00:00
elpolilla
e0cb3c1ec4 Added test's missing sample file
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40595
2009-01-02 02:36:27 +00:00
elpolilla
91a23e61bf Renamed LinkExtractors extract_urls method to extract_links
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40594
2009-01-02 02:34:44 +00:00
elpolilla
c82c799d07 Added ImageLinkExtractor's missing docstring
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40593
2009-01-02 02:25:17 +00:00
elpolilla
5757e54a28 - Added repr method to Link objects
- Added ImageLinkExtractor and tests

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40592
2009-01-02 02:18:11 +00:00
elpolilla
7bd73e0c09 Modified XPathSelectorLists: adding them should return a new XPathSelectorList
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40591
2008-12-31 13:23:25 +00:00
olveyra
99143ff357 doc string fix
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40590
2008-12-30 19:38:15 +00:00
Ismael Carnales
7e4c9200c2 corrected code blocks
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40589
2008-12-30 15:04:59 +00:00
olveyra
1433af5561 fix to commit (r586)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40588
2008-12-30 14:51:37 +00:00