How do I get the whole content between two xml tags in Python?

我寻月下人不归 2020-12-15 09:13

I am trying to get the whole content between an opening XML tag and its closing counterpart.

Getting the content in straightforward cases like `title` below is easy

5 answers

盖世英雄少女心 (original poster)

2020-12-15 09:26
Here's something that works for me and your sample:
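```
from lxml import etree

# Sample document; the inner <extradata> tag name is illustrative.
doc = etree.XML(
    """
<review>
  <title>Some testing stuff</title>
  <text>Some text with <extradata>data</extradata> in it.</text>
</review>
"""
)


def flatten(seq):
    """Join text nodes and serialized elements into one string."""
    parts = []
    for item in seq:
        if isinstance(item, str):
            # Text nodes come back as a str subclass.
            parts.append(str(item))
        elif etree.iselement(item):
            # with_tail=False: the tail is already a separate text node.
            parts.append(etree.tostring(item, with_tail=False, encoding="unicode"))
    return "".join(parts)


print(flatten(doc.xpath("/review/text/node()")))
```
Yields:
```
Some text with <extradata>data</extradata> in it.
```
The XPath selects all child nodes of the `<text>` element. Each node is either rendered to text directly if it is a string (lxml returns text nodes as a `str` subclass, `_ElementUnicodeResult`) or passed to `etree.tostring` if it is an element. `with_tail=False` avoids duplicating the tail, because the XPath already returns the text that follows a child element as a separate node.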

You may need to handle other node types if they are present.
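
For comparison, here is a rough alternative that avoids per-node type checks by using only the standard library's `xml.etree.ElementTree` instead of lxml. This is not the method from the answer above: `inner_xml` is a hypothetical helper name, the sample markup is illustrative, and the sketch relies on `ET.tostring` including each child's tail text. Note that the default ElementTree parser discards comments, so it is only a fit when those do not matter.

```python
import xml.etree.ElementTree as ET


def inner_xml(element):
    """Return everything between an element's opening and closing tags."""
    # Text before the first child (None when the element starts with a child).
    parts = [element.text or ""]
    for child in element:
        # ElementTree serializes each child together with its tail text,
        # so the text between and after children is picked up here.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


# Illustrative sample; the inner <extradata> tag name is an assumption.
root = ET.fromstring(
    "<review><title>Some testing stuff</title>"
    "<text>Some text with <extradata>data</extradata> in it.</text></review>"
)

print(inner_xml(root.find("text")))
print(inner_xml(root.find("title")))
```

The same helper covers the simple `title` case, since an element without children contributes only its own text.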