How do I get the whole content between two xml tags in Python?

我寻月下人不归 2020-12-15 09:13

I am trying to get the whole content between an opening XML tag and its closing counterpart.

Getting the content in straightforward cases, like title below, is easy

5 Answers
  •  春和景丽
    2020-12-15 09:34

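    from lxml import etree

    # <title> and <text> are the elements discussed in this thread;
    # <root> and <b> are placeholder names.
    t = etree.XML(
    """
    <root>
      <title>Some testing stuff</title>
      <text>Some text with <b>data</b> in it.</text>
    </root>
    """
    )
    # encoding='unicode' makes tostring return str instead of bytes on Python 3
    (t.text + ''.join(etree.tostring(c, encoding='unicode') for c in t)).strip()
    

    The trick here is that t is iterable and, when iterated, yields its child elements. Because etree has no separate text nodes, you also need to recover the text before the first child tag, which is stored in t.text. The text that follows each child is stored as that child's tail, and etree.tostring includes it by default.

    In [50]: (t.text + ''.join(etree.tostring(c, encoding='unicode') for c in t)).strip()
    Out[50]: '<title>Some testing stuff</title>\n  <text>Some text with <b>data</b> in it.</text>'
    

    Or:

    In [6]: e = t.xpath('//text')[0]
    
    In [7]: (e.text + ''.join(etree.tostring(c, encoding='unicode') for c in e)).strip()
    Out[7]: 'Some text with <b>data</b> in it.'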
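
The same idea can be wrapped in a small helper. The sketch below is an illustration, not part of the answer above. It assumes the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages, and inner_xml, the root wrapper and the b tag are hypothetical names. It also guards against element.text being None, which happens when an element is empty or starts directly with a child tag.

```python
import xml.etree.ElementTree as ET


def inner_xml(element):
    """Return everything between an element's opening and closing tags as a string."""
    # Text before the first child lives on element.text (None when absent).
    parts = [element.text or ""]
    for child in element:
        # tostring() serializes the child together with its tail, i.e. the text
        # that follows the child's closing tag, so text between children is kept.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


# Hypothetical sample document: <root> and <b> are placeholder names.
doc = ET.fromstring(
    "<root><title>Some testing stuff</title>"
    "<text>Some text with <b>data</b> in it.</text></root>"
)
text_content = inner_xml(doc.find("text"))
title_content = inner_xml(doc.find("title"))
print(text_content)   # Some text with <b>data</b> in it.
print(title_content)  # Some testing stuff
```

With lxml, the same function body should work if ET.tostring is swapped for etree.tostring(child, encoding='unicode').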
    
