How can I prevent lxml from auto-closing empty elements when serializing to string?

Posted by 故事扮演 on 2019-12-22 10:39:09

Question


I am parsing a huge XML file that contains many empty elements, such as

<MemoryEnv></MemoryEnv>

When serializing with

etree.tostring(root_element, pretty_print=True)

the output element is collapsed to

<MemoryEnv/>

Is there any way to prevent this? etree.tostring() does not seem to provide an option for it.

Is there a way to interfere with lxml's tostring() serializer?

By the way, the html module does not help. It is not designed for XML, and it does not write empty elements in their original form.

The problem is that, although the collapsed and uncollapsed forms of an empty element are equivalent XML, the program that parses this file does not accept collapsed empty elements.


Answer 1:


Here is a way to do it: make sure the text value of every empty element is an empty string rather than None. lxml then serializes the element with separate start and end tags.

Example:

from lxml import etree

XML = """
<root>
  <MemoryEnv></MemoryEnv>
  <AlsoEmpty></AlsoEmpty>
  <foo>bar</foo>
</root>"""

doc = etree.fromstring(XML)

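for elem in doc.iter():
    # An empty string (rather than None) makes lxml emit <tag></tag>.
    if elem.text is None:
        elem.text = ''

# encoding="unicode" returns str, so print() shows the XML as text.
print(etree.tostring(doc, encoding="unicode"))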

Output:

<root>
  <MemoryEnv></MemoryEnv>
  <AlsoEmpty></AlsoEmpty>
  <foo>bar</foo>
</root>
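
The loop above can be wrapped in a small helper. This is only a minimal sketch, not part of lxml: the name expand_empty_elements is hypothetical, and it narrows the original loop in two assumed ways. It touches only childless elements (elements with children are never collapsed anyway), and it skips comments and processing instructions by passing etree.Element as the tag filter.

```python
from lxml import etree


def expand_empty_elements(root):
    """Hypothetical helper: give every empty leaf element an empty text node.

    lxml then serializes such elements as <tag></tag> instead of <tag/>.
    """
    # iter(etree.Element) yields elements only (no comments or PIs).
    for elem in root.iter(etree.Element):
        if len(elem) == 0 and elem.text is None:
            elem.text = ""
    return root


doc = etree.fromstring("<root><MemoryEnv/><foo>bar</foo></root>")
serialized = etree.tostring(expand_empty_elements(doc), encoding="unicode")
print(serialized)
```

The tree is modified in place, so the helper would typically be called right before serializing.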

An alternative is to use the write_c14n() method to write canonical XML (which does not use the special empty-element syntax) to a file.

from lxml import etree

XML = """
<root>
  <MemoryEnv></MemoryEnv>
  <AlsoEmpty></AlsoEmpty>
  <foo>bar</foo>
</root>"""

doc = etree.fromstring(XML)

doc.getroottree().write_c14n("out.xml")



Answer 2:


Serializing with the c14n (canonical XML) method also works with lxml; it does not collapse empty elements.

>>> from lxml import etree
>>> s = "<MemoryEnv></MemoryEnv>"
>>> root_element = etree.XML(s)
>>> etree.tostring(root_element, method="c14n")
b'<MemoryEnv></MemoryEnv>'
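
Note that canonicalization changes more than the empty-element form, which may matter if the consuming program is picky about its input. The sketch below uses a made-up element with two attributes (not from the original file) to show one such effect: canonical XML emits attributes in sorted order, whereas the default serializer keeps document order.

```python
from lxml import etree

# Made-up input: attributes deliberately out of alphabetical order.
elem = etree.XML('<MemoryEnv size="4" name="x"/>')

default_out = etree.tostring(elem)
c14n_out = etree.tostring(elem, method="c14n")

print(default_out)  # collapsed form, attributes in document order
print(c14n_out)     # expanded form, attributes sorted by name
```

If attribute order or other formatting must be preserved exactly, setting the text of empty elements to an empty string is probably the less invasive option.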


Source: https://stackoverflow.com/questions/34111154/how-can-i-prevent-lxml-from-auto-closing-empty-elements-when-serializing-to-stri
