Extracting text after tag in Python's ElementTree

限于喜欢 提交于 2019-12-28 02:07:00

问题


Here is a part of XML:

<item><img src="cat.jpg" /> Picture of a cat</item>

Extracting the tag is easy. Just do:

et = xml.etree.ElementTree.fromstring(our_xml_string)
img = et.find('img')

But how to get the text immediately after it (Picture of a cat)? Doing the following returns a blank string:

print et.text

回答1:


Elements have a tail attribute -- so instead of element.text, you're asking for element.tail.

>>> import lxml.etree
>>> root = lxml.etree.fromstring('''<root><foo>bar</foo>baz</root>''')
>>> root[0]
<Element foo at 0x145a3c0>
>>> root[0].tail
'baz'

Or, for your example:

>>> et = lxml.etree.fromstring('''<item><img src="cat.jpg" /> Picture of a cat</item>''')
>>> et.find('img').tail
' Picture of a cat'

This also works with plain ElementTree:

>>> import xml.etree.ElementTree
>>> xml.etree.ElementTree.fromstring(
...   '''<item><img src="cat.jpg" /> Picture of a cat</item>'''
... ).find('img').tail
' Picture of a cat'


来源:https://stackoverflow.com/questions/9673906/extracting-text-after-tag-in-pythons-elementtree

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!