mirror of https://github.com/scrapy/scrapy.git synced 2025-02-27 23:43:46 +00:00

3409 Commits

Author SHA1 Message Date
Daniel Graña
28401fd47d Merge dev.scrapinghub.com:~/src/scrapy 2012-04-20 09:32:43 -03:00
Daniel Graña
29d6bcf0d1 Workaround bug in lxml for multiple select options
In lxml versions before 2.3.1 there is a bug that returns
all option elements instead of only the selected ones for a select tag

it is mentioned in 2.3.1 release notes
http://lxml.de/2.3/changes-2.3.1.html

and fixed by
57f49eed82 (L1R1139)
2012-04-20 12:24:30 +00:00
Pablo Hoffman
c7f33c534b removed redundant lines from release notes 2012-04-20 09:09:55 -03:00
Daniel Graña
15c8c01828 add 0.14.3 release notes 2012-04-19 23:25:15 -03:00
Daniel Graña
f34dd11c16 forgot to include pydispatch license. #118 2012-04-19 22:49:47 -03:00
Daniel Graña
771abd57c1 include egg files used by testsuite in source distribution. #118 2012-04-19 22:38:45 -03:00
Daniel Graña
8b45a00f36 Merge dangra/lxml-formrequest 2012-04-19 17:10:29 -03:00
Daniel Graña
72485128cb Add a test case to cover input elements as not direct child of form element. #111 #121 2012-04-19 16:58:01 -03:00
Daniel Graña
3b2458dbbb cleanup all FormRequest test cases. #111 #121 2012-04-19 16:48:39 -03:00
Daniel Graña
4340a13db5 More lxml FormRequest fixes. #111 #121
* test textarea elements
* handle odd cases for select elements like the Chrome and Firefox browsers do
* Remove test case already covered by per tag test cases
2012-04-19 16:48:31 -03:00
Daniel Graña
84d5f5ea53 test and fix each form input type with border cases. #111 #121 2012-04-19 15:40:07 -03:00
Daniel Graña
63e4355fba find form input elements using one xpath #111 #121 2012-04-19 14:11:13 -03:00
Daniel Graña
5c03c1df2d Merge pull request #121 from artem-dev/master
fix FormRequest for the case when form values are missing
2012-04-19 08:18:02 -07:00
Pablo Hoffman
97f362d64e removed deprecated/undocumented class: HTMLImageLinkExtractor 2012-04-19 12:07:46 -03:00
Artem Bogomyagkov
3a7c28f155 fixed FormRequest for the case of a form with missing values 2012-04-19 18:07:38 +03:00
Pablo Hoffman
13ac6f63eb Merge pull request #119 from andrix/fix-htmlimagelinkextractor
Fix HTMLImageLinkExtractor to work with libxml2 and lxml selectors
2012-04-19 07:56:38 -07:00
Pablo Hoffman
1e521a3ff3 improved command line tool usage help which explains that more commands are available when run from project directory. refs #107 2012-04-19 02:54:57 -03:00
Pablo Hoffman
2fb5e62c39 doc: update overview page to point to the genspider command. refs #107 2012-04-19 02:37:22 -03:00
Pablo Hoffman
1c5294bee1 update docstring in project template to avoid confusion with genspider command, which may be considered as an advanced feature. refs #107 2012-04-19 02:35:48 -03:00
Pablo Hoffman
d567d8efbe added note to docs/topics/firebug.rst about google directory being shut down 2012-04-19 01:34:20 -03:00
Daniel Graña
21e03729a3 lxml is the new default selector backend. closes #120 2012-04-19 00:28:27 -03:00
Daniel Graña
6bb40fe5a8 use xpath to match img tags in one shot. #119 2012-04-19 00:03:04 -03:00
Andrés Moreira
e24107feb8 fix HTMLImageLinkExtractor to work with libxml2 and lxml selectors 2012-04-18 18:05:44 -03:00
Pablo Hoffman
30ddbf624e mention about some scrapy.xlib modules removed in the release notes 2012-04-17 12:31:18 -03:00
Pablo Hoffman
d99ee6deb9 added some missing entries to release notes 2012-04-17 12:29:48 -03:00
Daniel Graña
60ee5d6213 Merge branch 'lxml-formrequest' 2012-04-15 00:47:39 -03:00
Daniel Graña
5b0df465e7 SELECT is matched as a form input but has no type attribute. #111 2012-04-15 00:47:30 -03:00
Daniel Graña
150e5734d3 Merge pull request #111 from LucianU/lxml-formrequest
Lxml formrequest
2012-04-13 12:29:04 -07:00
Daniel Graña
39395eb4f7 iteritems returns tuple elements duh!. #111 2012-04-13 16:22:25 -03:00
Daniel Graña
b88bdd05c4 do not add clickable if it is in formdata. #111 2012-04-13 16:20:35 -03:00
Daniel Graña
51e5aadd1d simplify formdata type inferring. #111 2012-04-13 16:07:27 -03:00
Daniel Graña
a11ef7fba7 reuse LxmlDocument in FormRequest. #111 2012-04-13 15:41:58 -03:00
Daniel Graña
9e10abcc43 Merge branch 'master' into lxml-formrequest 2012-04-13 14:50:49 -03:00
Daniel Graña
1789a55f28 Merge pull request #117 from dangra/lxml-document
Lxml document
2012-04-13 10:49:35 -07:00
Daniel Graña
4c7d29b7f7 Cache response's element trees using LxmlDocument similar to Libxml2Document 2012-04-13 14:43:30 -03:00
Daniel Graña
ac4f6cc17c unify libxml2 document and factories 2012-04-13 13:54:48 -03:00
Daniel Graña
2904dc2dc0 no need for MultipleElementsFound exception. #111 2012-04-13 13:28:16 -03:00
Daniel Graña
ee1f7847a4 more lxml form fixes and test cases. #111
* Do not treat "coord" attribute specially, just pass "NN,NN" as clickdata value
* Raise explicit ValueError if no clickable is found
* Fix bug looking for clickables through xpath when there is more than one form
* Test from_response with multiple clickdata
2012-04-13 13:09:21 -03:00
Daniel Graña
32b9f788be lxml form request cleanup. #111
* remove unused _nons function copied from lxml.html
* compute clickables only if dont_click is False
* less _get_clickables function branch nesting
2012-04-13 12:50:47 -03:00
Daniel Graña
e4d22cb16a reuse form_values() method from lxml to avoid copying code. #111 2012-04-13 10:31:59 -03:00
Daniel Graña
18a35a9fd6 Merge branch 'lxml-selectors' 2012-04-13 09:32:51 -03:00
Daniel Graña
3dbe211d29 lxml boolean results fix. oops #116 2012-04-13 09:29:56 -03:00
Daniel Graña
a338c29287 Merge pull request #116 from dangra/lxml-selectors
more fixes to lxml selector incompatibilities
2012-04-13 04:38:45 -07:00
Daniel Graña
b9efa5ee73 more fixes to lxml selector incompatibilities
* Do not fail parsing empty bodies
* Do not fail parsing bodies with null bytes
* Recode to utf8 using response.body_as_unicode() to avoid decoding bugs
* Return empty results with unevaluable nodes like text or attribute nodes
* Return u'1' and u'0' for boolean xpaths
2012-04-13 00:58:31 -03:00
Daniel Graña
d8ebf16fe5 Merge pull request #114 from stav/master
Scrapy DOC changes
2012-04-11 12:08:45 -07:00
Pablo Hoffman
7cca916ed5 added release notes to official documentation, including all release notes since Scrapy 0.7 2012-04-11 15:53:23 -03:00
Lucian Ursu
c760cc5cd8 Copied lxml.html._nons to not rely on that module's private interface and moved the check out of the for loop because it only needs to be done once 2012-04-11 21:40:00 +03:00
Lucian Ursu
f13a547203 Removed unnecessary iteration of formdata items 2012-04-11 21:08:43 +03:00
stav
f1802289cd small doc typo change to get the fork rolling 2012-04-11 12:05:39 -05:00
Daniel Graña
02833e3265 fix typo in module description. closes #112 2012-04-11 10:44:31 -03:00