mirror of https://github.com/scrapy/scrapy.git synced 2025-02-26 19:43:40 +00:00

874 Commits

Author SHA1 Message Date
samus_
7be6ff0727 typo
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40724
2009-01-14 12:12:37 +00:00
samus_
baf9a8d846 renamed expiration setting to the same used by the image pipeline
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40723
2009-01-14 12:09:06 +00:00
Pablo Hoffman
0ea37f51db * moved request fingerprinting from Request class to scrapy.utils.request - closes #50
* cleaned up fingerprint tests suite (only left relevant tests)

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40722
2009-01-14 01:17:40 +00:00
Pablo Hoffman
519458bdae added documentation for settings: ENGINE_DEBUG, DOWNLOADER_DEBUG
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40721
2009-01-14 00:19:22 +00:00
Pablo Hoffman
64d1f67c57 decreased logging level of RequestLimitMiddleware to DEBUG
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40720
2009-01-13 21:47:49 +00:00
Pablo Hoffman
4ed811b4d3 added DOWNLOAD_DELAY to default_settings and documentation, fixed some typo errors in settings reference
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40719
2009-01-13 14:43:38 +00:00
Pablo Hoffman
0e78003c92 removed my email from CLOSEDOMAIN_NOTIFY setting
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40718
2009-01-13 13:50:51 +00:00
Pablo Hoffman
e316722bb1 updated doc: ref/emails.rst and topics/downloader-middleware.rst
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40717
2009-01-13 11:55:20 +00:00
Pablo Hoffman
468bfeb278 removed unused imports
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40716
2009-01-13 10:10:19 +00:00
Pablo Hoffman
95d99d51b9 renamed old SchedulerStats web console module to ScheduleQueue and made it work with the new PriorityQueue
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40715
2009-01-13 02:49:50 +00:00
Pablo Hoffman
deb960526b removed unused (Django) classes from scrapy.utils.datatypes: MergeDict, SortedDict, DotExpandedDict, FileDict. And also removed unused class gzStringIO
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40714
2009-01-13 01:46:45 +00:00
Pablo Hoffman
ff637d9a0a added __len__ to PriorityQueue/Stack, and changed __iter__ implementation to return (item, priority) tuples, added more test cases
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40713
2009-01-13 01:37:49 +00:00
Pablo Hoffman
2434000cda * ported PriorityQueue and PriorityStack to use heapq instead of queue.Queue + bisect, which was up to 5x slower!
* added test case for PriorityStack (previously only PriorityQueue had one)
* changed Priority{Stack,Queue} API to just push() and pop(), and made them iterable

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40712
2009-01-13 01:14:40 +00:00
samus_
8b28d365b1 removed extra return
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40711
2009-01-12 22:43:10 +00:00
Andres Moreira
eca60c7c4d Small change in canonicalize_url improved its performance a bit
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40710
2009-01-12 17:18:54 +00:00
Pablo Hoffman
30e44c9a58 added settings: REQUEST_HEADER_ACCEPT, REQUEST_HEADER_ACCEPT_LANGUAGE. started built-in downloader middleware reference
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40709
2009-01-12 00:53:37 +00:00
Pablo Hoffman
d0046196d8 ported MailSender class to use twisted non-blocking IO
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40708
2009-01-11 23:04:50 +00:00
Pablo Hoffman
b0e37dc36a renamed StackTraceDebug extension to StackTraceDump
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40707
2009-01-11 21:27:38 +00:00
Pablo Hoffman
73074721de improved settings doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40706
2009-01-11 20:04:13 +00:00
Pablo Hoffman
dfdc04c28c some email doc improvements
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40705
2009-01-11 19:49:11 +00:00
Pablo Hoffman
09459ed7f7 added logging doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40704
2009-01-11 19:48:36 +00:00
Pablo Hoffman
9e93070382 moved email doc to reference (instead of topics)
--HG--
rename : scrapy/trunk/docs/topics/email.rst => scrapy/trunk/docs/ref/email.rst
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40703
2009-01-11 19:14:49 +00:00
Pablo Hoffman
5c15def3a5 added doc for scrapy.mail
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40702
2009-01-11 19:11:17 +00:00
Pablo Hoffman
bf0050c321 added doc for extensions and web console (closes #29 and #33). also started stats doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40701
2009-01-11 06:34:38 +00:00
Pablo Hoffman
300a0f4901 minor (and inoffensive) code improvements and fixes found while documenting
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40700
2009-01-11 06:31:07 +00:00
Pablo Hoffman
7cfefc6e70 some minor doc improvements here and there
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40699
2009-01-09 22:45:58 +00:00
Pablo Hoffman
45e812a8bd added misc section to doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40698
2009-01-09 22:45:09 +00:00
Pablo Hoffman
0e6b518f35 added FAQ entry about Django
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40697
2009-01-09 21:18:41 +00:00
elpolilla
71f7d62c68 Bugfix in AWSMiddleware regarding requests from local files
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40696
2009-01-09 17:35:04 +00:00
elpolilla
5fa1009712 Improved adaptors documentation
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40695
2009-01-09 13:45:11 +00:00
elpolilla
2cfd292f01 Added needed conversion from unicode to string before using twisted's logging system because it may trigger encoding issues
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40694
2009-01-09 10:49:48 +00:00
Pablo Hoffman
4c54305bee added docstrings to unicode_to_str and str_to_unicode
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40693
2009-01-09 10:35:14 +00:00
samus_
8e38f70a74 small improvement to Response.__init__ testcase
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40692
2009-01-09 01:33:12 +00:00
samus_
d3ac5d5ab8 added test for Response.__init__
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40691
2009-01-09 00:45:12 +00:00
samus_
8240fc48ad refactored ResponseBody's encoding test
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40690
2009-01-09 00:42:44 +00:00
samus_
1feace1bb0 a bit of performance
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40689
2009-01-08 18:29:56 +00:00
samus_
d0e5245fba fixed bug
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40688
2009-01-08 18:29:33 +00:00
samus_
c601bbb083 fixed bug in copy method of Response (tests coming soon)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40687
2009-01-08 16:08:04 +00:00
elpolilla
5cb9f344ac Changed method process_spider_output name to process_results in crawl and feed spiders
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40686
2009-01-08 15:42:33 +00:00
Pablo Hoffman
6568dc4f35 added custom CSS for scrapy doc with minor modifications (colors only)
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40685
2009-01-08 15:02:39 +00:00
Pablo Hoffman
6420c407d2 removed scrapyengine import from downloader code, minor improvements to docstrings
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40684
2009-01-08 14:21:20 +00:00
elpolilla
f39b7b507a Disabled test until memory leak in libxml is fixed
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40683
2009-01-08 13:46:14 +00:00
Pablo Hoffman
fc76af2eb4 reverted r680 until there are tests and documentation available about the change
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40682
2009-01-08 13:30:25 +00:00
elpolilla
4a9a48a629 . Added strip() to link texts in LinkExtractor in order to avoid extracting line breaks and such
. Added test for restrict_xpaths in RegexLinkExtractor

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40681
2009-01-08 11:05:15 +00:00
elpolilla
8278c0cffc Changed process_results name in feed spiders to process_spider_output and moved its parameters to a more standard order
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40680
2009-01-08 10:11:56 +00:00
Pablo Hoffman
32d842326a reverted r675 which broke precedence for environment variables
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40679
2009-01-07 18:06:39 +00:00
Pablo Hoffman
393cf349a0 updated settings doc
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40678
2009-01-07 18:04:40 +00:00
elpolilla
382621b814 Fixed bug in RegexLinkExtractor. Encoding was not being specified
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40677
2009-01-07 17:39:09 +00:00
elpolilla
307f8b3321 . Modified loading of command-specific settings (were loaded as defaults, now as overrides)
. Removed duplicated loading of extensions in shell command

--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40676
2009-01-07 17:31:06 +00:00
elpolilla
8029275c75 Removed non-scrapy import in shell command
--HG--
extra : convert_revision : svn%3Ab85faa78-f9eb-468e-a121-7cced6da292c%40675
2009-01-07 14:18:03 +00:00