How to preserve namespaces when parsing xml via ElementTree in Python

Submitted by 喜你入骨 on 2020-01-03 02:57:04

Question


Assume I have the following XML, which I want to modify using Python's ElementTree:

<root xmlns:prefix="URI">
  <child prefix:name="***"/>
  ...
</root> 

I'm doing some modification on the XML file like this:

import xml.etree.ElementTree as ET
tree = ET.parse('filename.xml')
# XML modification here
# save the modifications
tree.write('filename.xml')

Then the XML file looks like:

<root xmlns:ns0="URI">
  <child ns0:name="***"/>
  ...
</root>

As you can see, the namespace prefix changed to ns0. I'm aware that ET.register_namespace() can be used to control the prefix.

The problem with ET.register_namespace() is that:

  1. You need to know the prefix and URI in advance.
  2. It cannot be used with a default namespace.
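
For reference, the manual approach might look like the sketch below. It assumes the prefix and URI of the first example ("prefix" and "URI") are known up front, which is the limitation described in point 1. The inline XML string is a stand-in for the real file.

```python
import xml.etree.ElementTree as ET

# Assumed sample document, based on the first example in the question.
XML = '<root xmlns:prefix="URI"><child prefix:name="***"/></root>'

# The prefix and URI have to be hard-coded before serializing.
# Note that the registration is global to the ElementTree module.
ET.register_namespace('prefix', 'URI')

root = ET.fromstring(XML)
output = ET.tostring(root, encoding='unicode')
print(output)
```

With the registration in place the serialized output keeps "prefix" rather than falling back to a generated ns0.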

For example, if the XML looks like:

<root xmlns="http://uri">
    <child name="name">
    ...
    </child>
</root>

It will be transformed into something like:

<ns0:root xmlns:ns0="http://uri">
    <ns0:child name="name">
    ...
    </ns0:child>
</ns0:root>

As you can see, the default namespace is changed to ns0.

Is there any way to solve this problem with ElementTree?


Answer 1:


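Here is a way to preserve the namespace prefixes and URIs:

import xml.etree.ElementTree as ET

def register_all_namespaces(filename):
    # 'start-ns' events yield (prefix, uri) pairs; a default namespace has prefix ''
    namespaces = dict(node for _, node in ET.iterparse(filename, events=['start-ns']))
    for prefix, uri in namespaces.items():
        ET.register_namespace(prefix, uri)

Call this function before calling the write() method of the parsed tree (tree.write() in the question's code).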
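
A minimal end-to-end sketch of how the function above might be used on a document with a default namespace. The file name sample.xml, its contents, and the attribute change are made up for illustration; a temporary directory is used so the example does not touch real files.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def register_all_namespaces(filename):
    # Collect (prefix, uri) pairs from the 'start-ns' events and register each one.
    namespaces = dict(node for _, node in ET.iterparse(filename, events=['start-ns']))
    for prefix, uri in namespaces.items():
        ET.register_namespace(prefix, uri)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical input file using a default namespace.
    path = os.path.join(tmp, 'sample.xml')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('<root xmlns="http://uri"><child name="name"/></root>')

    register_all_namespaces(path)              # must run before write()
    tree = ET.parse(path)
    tree.getroot()[0].set('name', 'changed')   # example modification
    tree.write(path)

    with open(path, encoding='utf-8') as f:
        result = f.read()

print(result)
```

Two caveats, as far as I know: ET.register_namespace() raises ValueError for prefixes of the form ns followed by digits (such as ns0), since those are reserved, and because the mapping is built as a dict, a prefix that is bound to different URIs in different parts of the file keeps only the last binding.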



Source: https://stackoverflow.com/questions/54439309/how-to-preserve-namespaces-when-parsing-xml-via-elementtree-in-python
