How to get the full contents of a node using XPath & lxml?

五迷三道 posted on 2019-12-05 16:44:35

I'm not sure I understand -- is this close to what you are looking for?

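import io
import lxml.etree as le

content = '''\
<font face="verdana" color="#ffffff" size="2"><a href="url">inside</a> something</font>
'''
# Parse the markup from an in-memory file-like object.
doc = le.parse(io.StringIO(content))

# Select every element child of the matching <font> nodes.
xpath = '//font[@face="verdana" and @color="#ffffff" and @size="2"]/child::*'
x = doc.xpath(xpath)
# tostring() also serializes each element's tail text; encoding='unicode' returns str.
print([le.tostring(node, encoding='unicode') for node in x])
# ['<a href="url">inside</a> something']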

Is there any way to use a pure XPath query to get the contents of the <font> nodes, or even to force lxml to return a string of the contents from the .xpath() method, rather than an lxml object?

Note that I'm returning a list of many nodes from the XPath query so the solution needs to support that.

Just to clarify: I want to return something like <a href="url">inside</a> something from markup like this:

<font face="verdana" color="#ffffff" size="2"><a href="url">inside</a> something</font>

Short answer: No.

XPath doesn't operate on "tags" but on nodes.

The selected nodes are represented as instances of specific objects in the language that is hosting XPath.

If you need the string representation of a particular node's markup, such objects typically support an outerXml property; check the documentation of the hosting language or library (lxml in this case).

As @Robert-Rossney pointed out in his comment: lxml's tostring() method is equivalent to other environments' outerXml property.
