Daniel Graña
3b2458dbbb
cleanup all FormRequest test cases. #111 #121
2012-04-19 16:48:39 -03:00
Daniel Graña
4340a13db5
More lxml FormRequest fixes. #111 #121
...
* test textarea elements
* handle odd cases for select elements like Chrome and FF browsers do
* Remove test case already covered by per tag test cases
2012-04-19 16:48:31 -03:00
Daniel Graña
84d5f5ea53
test and fix each form input type with border cases. #111 #121
2012-04-19 15:40:07 -03:00
Daniel Graña
63e4355fba
find form input elements using one xpath #111 #121
2012-04-19 14:11:13 -03:00
Daniel Graña
5c03c1df2d
Merge pull request #121 from artem-dev/master
...
fix FormRequest for the case when form values are missing
2012-04-19 08:18:02 -07:00
Pablo Hoffman
97f362d64e
removed deprecated/undocumented class: HTMLImageLinkExtractor
2012-04-19 12:07:46 -03:00
Artem Bogomyagkov
3a7c28f155
fixed FormRequest for the case of a form with missing values
2012-04-19 18:07:38 +03:00
Pablo Hoffman
13ac6f63eb
Merge pull request #119 from andrix/fix-htmlimagelinkextractor
...
Fix HTMLImageLinkExtractor to work with libxml2 and lxml selectors
2012-04-19 07:56:38 -07:00
Pablo Hoffman
1e521a3ff3
improved command line tool usage help to explain that more commands are available when run from a project directory. refs #107
2012-04-19 02:54:57 -03:00
Pablo Hoffman
2fb5e62c39
doc: update overview page to point to the genspider command. refs #107
2012-04-19 02:37:22 -03:00
Pablo Hoffman
1c5294bee1
update docstring in project template to avoid confusion with genspider command, which may be considered as an advanced feature. refs #107
2012-04-19 02:35:48 -03:00
Pablo Hoffman
d567d8efbe
added note to docs/topics/firebug.rst about Google Directory being shut down
2012-04-19 01:34:20 -03:00
Daniel Graña
21e03729a3
lxml is the new default selector backend. closes #120
2012-04-19 00:28:27 -03:00
Daniel Graña
6bb40fe5a8
use xpath to match img tags in one shot. #119
2012-04-19 00:03:04 -03:00
Andrés Moreira
e24107feb8
fix HTMLImageLinkExtractor to work with libxml2 and lxml selectors
2012-04-18 18:05:44 -03:00
Pablo Hoffman
30ddbf624e
mention the removal of some scrapy.xlib modules in the release notes
2012-04-17 12:31:18 -03:00
Pablo Hoffman
d99ee6deb9
added some missing entries to release notes
2012-04-17 12:29:48 -03:00
Daniel Graña
60ee5d6213
Merge branch 'lxml-formrequest'
2012-04-15 00:47:39 -03:00
Daniel Graña
5b0df465e7
SELECT is matched as a form input but has no type attribute. #111
2012-04-15 00:47:30 -03:00
Daniel Graña
150e5734d3
Merge pull request #111 from LucianU/lxml-formrequest
...
Lxml formrequest
2012-04-13 12:29:04 -07:00
Daniel Graña
39395eb4f7
iteritems returns tuple elements, duh! #111
2012-04-13 16:22:25 -03:00
Daniel Graña
b88bdd05c4
do not add clickable if it is in formdata. #111
2012-04-13 16:20:35 -03:00
Daniel Graña
51e5aadd1d
simplify formdata type inferring. #111
2012-04-13 16:07:27 -03:00
Daniel Graña
a11ef7fba7
reuse LxmlDocument in FormRequest. #111
2012-04-13 15:41:58 -03:00
Daniel Graña
9e10abcc43
Merge branch 'master' into lxml-formrequest
2012-04-13 14:50:49 -03:00
Daniel Graña
1789a55f28
Merge pull request #117 from dangra/lxml-document
...
Lxml document
2012-04-13 10:49:35 -07:00
Daniel Graña
4c7d29b7f7
Cache response's element trees using LxmlDocument similar to Libxml2Document
2012-04-13 14:43:30 -03:00
Daniel Graña
ac4f6cc17c
unify libxml2 document and factories
2012-04-13 13:54:48 -03:00
Daniel Graña
2904dc2dc0
no need for MultipleElementsFound exception. #111
2012-04-13 13:28:16 -03:00
Daniel Graña
ee1f7847a4
more lxml form fixes and test cases. #111
...
* Do not treat "coord" attribute specially, just pass "NN,NN" as clickdata value
* Raise explicit ValueError if no clickable is found
* Fix bug looking for clickables through xpath when there is more than one form
* Test from_response with multiple clickdata
2012-04-13 13:09:21 -03:00
Daniel Graña
32b9f788be
lxml form request cleanup. #111
...
* remove unused _nons function copied from lxml.html
* compute clickables only if dont_click is False
* less _get_clickables function branch nesting
2012-04-13 12:50:47 -03:00
Daniel Graña
e4d22cb16a
reuse form_values() method from lxml to avoid copying code. #111
2012-04-13 10:31:59 -03:00
Daniel Graña
18a35a9fd6
Merge branch 'lxml-selectors'
2012-04-13 09:32:51 -03:00
Daniel Graña
3dbe211d29
lxml boolean results fix. oops #116
2012-04-13 09:29:56 -03:00
Daniel Graña
a338c29287
Merge pull request #116 from dangra/lxml-selectors
...
more fixes to lxml selector incompatibilities
2012-04-13 04:38:45 -07:00
Daniel Graña
b9efa5ee73
more fixes to lxml selector incompatibilities
...
* Do not fail parsing empty bodies
* Do not fail parsing bodies with null bytes
* Recode to utf8 using response.body_as_unicode() to avoid decoding bugs
* Return empty results with unevaluable nodes like text or attribute nodes
* Return u'1' and u'0' for boolean xpaths
2012-04-13 00:58:31 -03:00
Daniel Graña
d8ebf16fe5
Merge pull request #114 from stav/master
...
Scrapy DOC changes
2012-04-11 12:08:45 -07:00
Pablo Hoffman
7cca916ed5
added release notes to official documentation, including all release notes since Scrapy 0.7
2012-04-11 15:53:23 -03:00
Lucian Ursu
c760cc5cd8
Copied lxml.html._nons to avoid relying on that module's private interface, and moved the check out of the for loop because it only needs to be done once
2012-04-11 21:40:00 +03:00
Lucian Ursu
f13a547203
Removed unnecessary iteration of formdata items
2012-04-11 21:08:43 +03:00
stav
f1802289cd
small doc typo change to get the fork rolling
2012-04-11 12:05:39 -05:00
Daniel Graña
02833e3265
fix typo in module description. closes #112
2012-04-11 10:44:31 -03:00
Daniel Graña
a0a1a5026b
do formdata encoding and serialization in one place. refs #111
2012-04-11 10:07:56 -03:00
Pablo Hoffman
4f28ffcb2c
removed no longer needed dependency on simplejson
2012-04-10 16:01:36 -03:00
Pablo Hoffman
6e8edbd72e
switched default selectors backend to lxml
2012-04-10 15:52:14 -03:00
Daniel Graña
af0e1c40f5
Avoid logging useless error messages about ignored requests in robots.txt
2012-04-10 13:37:32 -03:00
Lucian Ursu
4be6c22c4d
Removed ClientForm with its patch and tests, and BeautifulSoup
2012-04-10 10:24:30 +03:00
Lucian Ursu
df2e795278
Added test case to make sure that ambiguous clickdata is not allowed
2012-04-10 10:19:59 +03:00
Lucian Ursu
eb47849c05
Replaced ClientForm-based FormRequest with a lxml-based implementation
2012-04-10 10:18:54 +03:00
Daniel Graña
97e4003a56
do not fail handling unicode xpaths in libxml2 backed selectors
2012-04-04 17:18:31 -03:00