问题
I have this code:
<table cellspacing="1" cellpadding="1" border="0">
<tbody>
<tr>
<td>Something else</td>
</tr>
<tr>
<td valign="top">
<a href="http://exact url">Something</a>
</td>
<td valign="top">Something else</td>
</tr>
</tbody>
</table>
I want to find the Table but is very hard to target it (the very same code is used like 10 times). But I know what is in the URL. How can I get then the parent table?
回答1:
If t
is the etree
for this snippet of XML, then the link you're looking for is
t.xpath('//a[@href = "http://exact url"]')[0]
From there, you can get to the table
using the ancestor
axis:
t.xpath('//a[@href = "http://exact url"]/ancestor::table')[-1]
回答2:
Filter tables by using []. Note that the attribute is a grandchild //table[.//@href="blah"]
Or //a[@href="blah"]//ancestor::table
回答3:
A pure XPath solution.
Use:
(//a[@href = "http://exact url"])[1]/ancestor::table[1]
This selects the first ancestor table
of the first a
element in the XML document, the string value of whose href
attribute is the string "http://exact url"
.
This provides the correct table
element even in case when there are nested tables each of which has the wanted a
element as descendant. In this case the above XPath expression selects the innermost such table
-- in contrast with the currently accepted answer, that obtains the outermost table
ancestor.
回答4:
//a[@href="http://exact url"]/../../..
You'll need 3 ..
s to reach the table element.
来源:https://stackoverflow.com/questions/9471419/how-to-select-parent-based-on-the-child-in-lxml