How to prevent xml.etree.ElementTree fromstring from dropping comment nodes

Submitted by 不羁的心 on 2019-11-26 20:58:10

Question


I have the following code fragment:

    from xml.etree.ElementTree import fromstring, tostring

    def prefix_tags(source):
        mathml = fromstring(source)
        # iter() replaces getiterator(), which was removed in Python 3.9
        for elem in mathml.iter():
            elem.tag = 'm:' + elem.tag
        return tostring(mathml)

When I feed it the following input:

    <math>
      <a> 1 2 3 </a>  <b />
    <foo>Uitleg</foo>
    <!-- <bar> -->
    </math>

It results in:

    <m:math>
      <m:a> 1 2 3 </m:a>  <m:b />
    <m:foo>Uitleg</m:foo>

    </m:math>

How come? And how can I preserve the comment?

Edit: I don't care which XML library is used, as long as I can apply the tag change shown above. Unfortunately, lxml does not seem to allow this (and I cannot use proper namespace operations).


Answer 1:


You cannot with xml.etree's default parser, because it ignores comments (which is acceptable behaviour for an XML parser, by the way). But you can if you use the (compatible) lxml library, which allows you to configure parser options.

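    from lxml import etree

    # remove_comments=False keeps comment nodes in the tree
    # (this is also lxml's default; it is spelled out here for clarity)
    parser = etree.XMLParser(remove_comments=False)
    tree = etree.parse('input.xml', parser=parser)
    # or alternatively set the parser as default:
    # etree.set_default_parser(parser)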

This is by far the easiest option. If you really have to use xml.etree, you could try hooking up your own parser, although even then comments are not officially supported: have a look at this example from the author of xml.etree (it still seems to work in Python 2.7, by the way).
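On Python 3.8 and later, the standard library's `TreeBuilder` accepts an `insert_comments` flag, so a hand-written parser hook may no longer be needed. The following is a minimal sketch of the tag-prefixing task under that assumption. The function name `prefix_tags` and its `prefix` parameter are illustrative and not part of the original question. Comment nodes carry the `ET.Comment` function as their tag rather than a string, so the loop skips them.

```python
import xml.etree.ElementTree as ET


def prefix_tags(xml_text, prefix="m:"):
    """Prefix every element tag while keeping comments (assumes Python 3.8+)."""
    # insert_comments=True makes the builder keep comment nodes in the tree
    builder = ET.TreeBuilder(insert_comments=True)
    parser = ET.XMLParser(target=builder)
    root = ET.fromstring(xml_text, parser=parser)
    for elem in root.iter():
        # Comment nodes have ET.Comment (a function) as their tag, not a str
        if isinstance(elem.tag, str):
            elem.tag = prefix + elem.tag
    return ET.tostring(root, encoding="unicode")


result = prefix_tags("<math><a> 1 2 3 </a><!-- <bar> --></math>")
print(result)
```

Unlike lxml, xml.etree does not validate tag names, which is why assigning a tag such as `m:math` is accepted here.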



Source: https://stackoverflow.com/questions/5409161/how-to-prevent-xml-elementtree-fromstring-from-dropping-commentnode