mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 19:43:50 +00:00

3050 Commits

Author SHA1 Message Date
Daniel Graña
3b2458dbbb cleanup all FormRequest test cases. #111 #121 2012-04-19 16:48:39 -03:00
Daniel Graña
4340a13db5 More lxml FormRequest fixes. #111 #121
* test textarea elements
* handle odd cases for select elements the way Chrome and Firefox browsers do
* Remove test case already covered by per tag test cases
2012-04-19 16:48:31 -03:00
Daniel Graña
84d5f5ea53 test and fix each form input type with border cases. #111 #121 2012-04-19 15:40:07 -03:00
Daniel Graña
63e4355fba find form input elements using one xpath #111 #121 2012-04-19 14:11:13 -03:00
Daniel Graña
5c03c1df2d Merge pull request #121 from artem-dev/master
fix FormRequest for the case when form values are missing
2012-04-19 08:18:02 -07:00
Pablo Hoffman
97f362d64e removed deprecated/undocumented class: HTMLImageLinkExtractor 2012-04-19 12:07:46 -03:00
Artem Bogomyagkov
3a7c28f155 fixed FormRequest for the case of a form with missing values 2012-04-19 18:07:38 +03:00
Pablo Hoffman
13ac6f63eb Merge pull request #119 from andrix/fix-htmlimagelinkextractor
Fix HTMLImageLinkExtractor to work with libxml2 and lxml selectors
2012-04-19 07:56:38 -07:00
Pablo Hoffman
1e521a3ff3 improved command line tool usage help which explains that more commands are available when run from project directory. refs #107 2012-04-19 02:54:57 -03:00
Pablo Hoffman
2fb5e62c39 doc: update overview page to point to the genspider command. refs #107 2012-04-19 02:37:22 -03:00
Pablo Hoffman
1c5294bee1 update docstring in project template to avoid confusion with genspider command, which may be considered as an advanced feature. refs #107 2012-04-19 02:35:48 -03:00
Pablo Hoffman
d567d8efbe added note to docs/topics/firebug.rst about google directory being shut down 2012-04-19 01:34:20 -03:00
Daniel Graña
21e03729a3 lxml is the new default selector backend. closes #120 2012-04-19 00:28:27 -03:00
Daniel Graña
6bb40fe5a8 use xpath to match img tags in one shot. #119 2012-04-19 00:03:04 -03:00
Andrés Moreira
e24107feb8 fix HTMLImageLinkExtractor to work with libxml2 and lxml selectors 2012-04-18 18:05:44 -03:00
Pablo Hoffman
30ddbf624e mention the removal of some scrapy.xlib modules in the release notes 2012-04-17 12:31:18 -03:00
Pablo Hoffman
d99ee6deb9 added some missing entries to release notes 2012-04-17 12:29:48 -03:00
Daniel Graña
60ee5d6213 Merge branch 'lxml-formrequest' 2012-04-15 00:47:39 -03:00
Daniel Graña
5b0df465e7 SELECT is matched as a form input but has no type attribute. #111 2012-04-15 00:47:30 -03:00
Daniel Graña
150e5734d3 Merge pull request #111 from LucianU/lxml-formrequest
Lxml formrequest
2012-04-13 12:29:04 -07:00
Daniel Graña
39395eb4f7 iteritems returns tuple elements duh!. #111 2012-04-13 16:22:25 -03:00
Daniel Graña
b88bdd05c4 do not add clickable if it is in formdata. #111 2012-04-13 16:20:35 -03:00
Daniel Graña
51e5aadd1d simplify formdata type inferring. #111 2012-04-13 16:07:27 -03:00
Daniel Graña
a11ef7fba7 reuse LxmlDocument in FormRequest. #111 2012-04-13 15:41:58 -03:00
Daniel Graña
9e10abcc43 Merge branch 'master' into lxml-formrequest 2012-04-13 14:50:49 -03:00
Daniel Graña
1789a55f28 Merge pull request #117 from dangra/lxml-document
Lxml document
2012-04-13 10:49:35 -07:00
Daniel Graña
4c7d29b7f7 Cache response's element trees using LxmlDocument similar to Libxml2Document 2012-04-13 14:43:30 -03:00
Daniel Graña
ac4f6cc17c unify libxml2 document and factories 2012-04-13 13:54:48 -03:00
Daniel Graña
2904dc2dc0 no need for MultipleElementsFound exception. #111 2012-04-13 13:28:16 -03:00
Daniel Graña
ee1f7847a4 more lxml form fixes and test cases. #111
* Do not treat "coord" attribute specially, just pass "NN,NN" as clickdata value
* Raise explicit ValueError if no clickable is found
* Fix bug looking for clickables through xpath when there is more than one form
* Test from_response with multiple clickdata
2012-04-13 13:09:21 -03:00
Daniel Graña
32b9f788be lxml form request cleanup. #111
* remove unused _nons function copied from lxml.html
* compute clickables only if dont_click is False
* less _get_clickables function branch nesting
2012-04-13 12:50:47 -03:00
Daniel Graña
e4d22cb16a reuse form_values() method from lxml to avoid copying code. #111 2012-04-13 10:31:59 -03:00
Daniel Graña
18a35a9fd6 Merge branch 'lxml-selectors' 2012-04-13 09:32:51 -03:00
Daniel Graña
3dbe211d29 lxml boolean results fix. oops #116 2012-04-13 09:29:56 -03:00
Daniel Graña
a338c29287 Merge pull request #116 from dangra/lxml-selectors
more fixes to lxml selector incompatibilities
2012-04-13 04:38:45 -07:00
Daniel Graña
b9efa5ee73 more fixes to lxml selector incompatibilities
* Do not fail parsing empty bodies
* Do not fail parsing bodies with null bytes
* Recode to utf8 using response.body_as_unicode() to avoid decoding bugs
* Return empty results with unevaluable nodes like text or attribute nodes
* Return u'1' and u'0' for boolean xpaths
2012-04-13 00:58:31 -03:00
Daniel Graña
d8ebf16fe5 Merge pull request #114 from stav/master
Scrapy DOC changes
2012-04-11 12:08:45 -07:00
Pablo Hoffman
7cca916ed5 added release notes to official documentation, including all release notes since Scrapy 0.7 2012-04-11 15:53:23 -03:00
Lucian Ursu
c760cc5cd8 Copied lxml.html._nons to not rely on that module's private interface and moved the check out of the for loop because it only needs to be done once 2012-04-11 21:40:00 +03:00
Lucian Ursu
f13a547203 Removed unnecessary iteration of formdata items 2012-04-11 21:08:43 +03:00
stav
f1802289cd small doc typo change to get the fork rolling 2012-04-11 12:05:39 -05:00
Daniel Graña
02833e3265 fix typo in module description. closes #112 2012-04-11 10:44:31 -03:00
Daniel Graña
a0a1a5026b do formdata encoding and serialization in one place. refs #111 2012-04-11 10:07:56 -03:00
Pablo Hoffman
4f28ffcb2c removed no longer needed dependency on simplejson 2012-04-10 16:01:36 -03:00
Pablo Hoffman
6e8edbd72e switched default selectors backend to lxml 2012-04-10 15:52:14 -03:00
Daniel Graña
af0e1c40f5 Avoid logging useless error messages about ignored requests in robots.txt 2012-04-10 13:37:32 -03:00
Lucian Ursu
4be6c22c4d Removed ClientForm with its patch and tests, and BeautifulSoup 2012-04-10 10:24:30 +03:00
Lucian Ursu
df2e795278 Added test case to make sure that ambiguous clickdata is not allowed 2012-04-10 10:19:59 +03:00
Lucian Ursu
eb47849c05 Replaced ClientForm-based FormRequest with a lxml-based implementation 2012-04-10 10:18:54 +03:00
Daniel Graña
97e4003a56 do not fail handling unicode xpaths in libxml2 backed selectors 2012-04-04 17:18:31 -03:00