mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 15:04:37 +00:00

1503 Commits

Author SHA1 Message Date
Pablo Hoffman
82e4b6adcf merge with ismael repo 2009-08-18 15:38:20 -03:00
Pablo Hoffman
e01d31c498 another improvement to doc navbar 2009-08-18 15:35:53 -03:00
Ismael Carnales
be4226cec5 merge 2009-08-18 15:21:39 -03:00
Ismael Carnales
f88fb27851 fixed error in xpath selectors doc 2009-08-18 15:18:49 -03:00
Ismael Carnales
67c0c6a9e4 corrected indentation in xpath selectors doc 2009-08-18 15:13:23 -03:00
Pablo Hoffman
ff837e5a45 doc: improved top navbar 2009-08-18 15:12:44 -03:00
Ismael Carnales
428dfe0d4a corrected the style of spiders documentation 2009-08-18 15:06:33 -03:00
Pablo Hoffman
0192282d07 reorganized doc and moved robotstxt doc inside downloader middlewares doc 2009-08-18 14:36:18 -03:00
Ismael Carnales
33089d287d merged topics and reference doc 2009-08-18 14:05:15 -03:00
Pablo Hoffman
e82d9d885e some speedups to offsite spider middleware using regexes and urlparse_cached 2009-08-18 12:43:25 -03:00
Pablo Hoffman
7f30461410 added support for defining EXTENSIONS setting using dicts, like middleware settings
--HG--
rename : scrapy/tests/test_utils_middleware.py => scrapy/tests/test_utils_conf.py
rename : scrapy/utils/middleware.py => scrapy/utils/conf.py
2009-08-18 11:05:36 -03:00
Ismael Carnales
1bbe7991dc added documentation for ImagesPipeline 2009-08-18 09:35:32 -03:00
Ismael Carnales
ec8f934a8e corrected import path in scrapy-admin.py 2009-08-18 09:02:23 -03:00
Pablo Hoffman
4b910881da make sure get_vmvalue_from_procfs returns int 2009-08-18 00:59:32 -03:00
Daniel Grana
49cc1b1e7e add missing future import for python 2.5 2009-08-17 21:42:37 -03:00
Pablo Hoffman
16c51353f2 updated select() method in crawl spider template 2009-08-17 21:22:05 -03:00
Pablo Hoffman
9ae3a1946d remove Url class and use str instead for Request and Response urls. Also added urlparse_cached function for achieving the same caching functionality provided by old Url class 2009-08-17 21:16:55 -03:00
Pablo Hoffman
29710461f6 some refactoring to robotstxt downloader middleware 2009-08-17 19:11:43 -03:00
Pablo Hoffman
d6da7eb04f removed backwards compatibility alias: load_class 2009-08-17 18:32:24 -03:00
Pablo Hoffman
e71764e707 removed unused functions: memoize, gzip_file 2009-08-17 18:30:40 -03:00
Pablo Hoffman
50de128e0d removed unused items_to_csv function 2009-08-17 18:21:21 -03:00
Pablo Hoffman
6dfda357ef removed unused dict_updatedefault function 2009-08-17 18:19:59 -03:00
Pablo Hoffman
c81987f624 removed unused hash_values function 2009-08-17 18:16:44 -03:00
Pablo Hoffman
4f06f6ef41 applied fix to deprecated decorator to warn only once (thanks Dan) 2009-08-17 18:13:45 -03:00
Pablo Hoffman
41036af643 some refactoring to genspider command 2009-08-17 17:59:34 -03:00
Ismael Carnales
48b40bd620 renamed x method of selectors to select 2009-08-17 15:58:06 -03:00
Pablo Hoffman
59e0a83ad4 removed more obsolete adaptors code 2009-08-17 14:48:11 -03:00
Pablo Hoffman
1bf5682144 removed unused modules 2009-08-17 14:41:06 -03:00
Pablo Hoffman
62c29e4d69 removed unused module 2009-08-17 14:39:54 -03:00
Pablo Hoffman
db8b29500f removed duplicated code from memdebug extension (already present in memusage extension) 2009-08-17 14:25:38 -03:00
Pablo Hoffman
b20837a028 added __slots__ to Request/Response/Headers objects, to reduce memory footprint 2009-08-17 13:01:33 -03:00
Pablo Hoffman
45ed662ee5 some cleanup to memusage and memdebug extensions 2009-08-17 09:41:02 -03:00
Pablo Hoffman
f1980a3d9f updated some RFC numbers 2009-08-15 20:54:34 -03:00
Pablo Hoffman
08424ee093 OffsiteMiddleware: isolate policy of urls belonging to spiders into a separate method 2009-08-15 20:44:11 -03:00
Pablo Hoffman
bc87350133 removed legacy comment, and wrapped some lines to 80 columns 2009-08-15 20:34:40 -03:00
Pablo Hoffman
08194186b4 improved docstring a encoding parameter of safe_url_string function. also added some unittests 2009-08-15 19:44:37 -03:00
Pablo Hoffman
2463842acb HttpErrorMiddleware: performance improvement and added support for 'handle_httpstatus_list' request meta key 2009-08-15 17:14:35 -03:00
Ismael Carnales
8c2f62ba9c added Item Exporters documentation 2009-08-14 09:16:29 -03:00
Daniel Grana
7fca2f26f6 Automated merge with ssh://hg.scrapy.org/scrapy 2009-08-14 01:34:35 -03:00
Daniel Grana
e0ccbed40d imported patch improve_download_troughput.patch 2009-08-14 01:33:36 -03:00
Pablo Hoffman
02b01c3d2b loaders doc: fixed outdated line 2009-08-13 23:24:24 -03:00
Daniel Grana
6a74b51371 remove dupes words in loaders doc, and unused import in example
--HG--
extra : rebase_source : ea7886725af7a54bd1031cb28f92efbdfe921d9e
2009-08-13 23:23:08 -03:00
Pablo Hoffman
5494237cbb upgraded bundled beautifulsoup to 3.0.7a 2009-08-13 22:30:07 -03:00
Pablo Hoffman
23d0c08c65 added missing module in previous commit 2009-08-13 22:29:13 -03:00
Pablo Hoffman
d6f4e382ae moved scrapy.utils.db module to scrapy.utils.mysql
--HG--
rename : scrapy/utils/db.py => scrapy/utils/mysql.py
2009-08-13 22:14:55 -03:00
Pablo Hoffman
8be62e2508 commented out line until we find a proper fix 2009-08-13 22:11:30 -03:00
Pablo Hoffman
d8722c8c34 cleaned up scrapy.utils.db module 2009-08-13 22:09:19 -03:00
Pablo Hoffman
bebc8b2027 removed obsolete scrapy.contrib.item module (RobustScrapedItem model) 2009-08-13 21:50:41 -03:00
Pablo Hoffman
1624981443 added tests for builtin loader processors 2009-08-13 15:33:20 -03:00
Ismael Carnales
066fd9fa84 renamed internal names of Item Loader 2009-08-13 13:32:02 -03:00