I am trying to get the entire content between an opening XML tag and its closing counterpart.
Getting the content in straightforward cases like title below is easy.
from lxml import etree
t = etree.XML(
"""
Some testing stuff
Some text with data in it.
"""
)
(t.text + ''.join(map(etree.tostring, t))).strip()
The trick here is that t is iterable and, when iterated, yields all of its child nodes. Because etree does not represent text as separate nodes (it stores it on the .text and .tail attributes of elements instead), you also need to recover the text before the first child tag with t.text.
In [50]: (t.text + ''.join(map(etree.tostring, t))).strip()
Out[50]: 'Some testing stuff \n Some text with data in it. '
Or:
In [6]: e = t.xpath('//text')[0]
In [7]: (e.text + ''.join(map(etree.tostring, e))).strip()
Out[7]: 'Some text with data in it.'