Python module xml.etree.ElementTree modifies xml namespace keys automatically

Posted by 萝らか妹 on 2020-03-01 07:44:41

Question


I've noticed that Python's ElementTree module changes the XML data in the following simple example:

import xml.etree.ElementTree as ET
tree = ET.parse("./input.xml")
tree.write("./output.xml")

I wouldn't expect the file to change, since this is a plain read-and-write test with no modification in between. However, the result tells a different story, especially in the namespace prefixes (default namespace with no prefix --> ns0, d3p1 --> ns1, i --> ns2):

input.xml:

<?xml version="1.0" encoding="utf-8"?>
<ServerData xmlns:i="http://www.a.org" xmlns="http://schemas.xxx/2004/07/Server.Facades.ImportExport">
<CreationDate>0001-01-01T00:00:00</CreationDate>
<Processes>
    <Processes xmlns:d3p1="http://schemas.datacontract.org/2004/07/Management.Interfaces">
        <d3p1:ProtectedProcess>
            <d3p1:Description>/Applications/Safari.app/Contents/MacOS/Safari</d3p1:Description>
            <d3p1:DiscoveredMachine i:nil="true" />
            <d3p1:Id>0</d3p1:Id>
            <d3p1:Name>/applications/safari.app/contents/macos/safari</d3p1:Name>
            <d3p1:Path>/Applications/Safari.app/Contents/MacOS/Safari</d3p1:Path>
            <d3p1:ProcessHashes xmlns:d5p1="http://schemas.datacontract.org/2004/07/Management.Interfaces.WildFire" />
            <d3p1:Status>1</d3p1:Status>
            <d3p1:Type>Protected</d3p1:Type>
        </d3p1:ProtectedProcess>
    </Processes>
</Processes>
</ServerData>

and output.xml:

<ns0:ServerData xmlns:ns0="http://schemas.xxx/2004/07/Server.Facades.ImportExport" xmlns:ns1="http://schemas.datacontract.org/2004/07/Management.Interfaces" xmlns:ns2="http://www.a.org">
<ns0:CreationDate>0001-01-01T00:00:00</ns0:CreationDate>
<ns0:Processes>
    <ns0:Processes>
        <ns1:ProtectedProcess>
            <ns1:Description>/Applications/Safari.app/Contents/MacOS/Safari</ns1:Description>
            <ns1:DiscoveredMachine ns2:nil="true" />
            <ns1:Id>0</ns1:Id>
            <ns1:Name>/applications/safari.app/contents/macos/safari</ns1:Name>
            <ns1:Path>/Applications/Safari.app/Contents/MacOS/Safari</ns1:Path>
            <ns1:ProcessHashes />
            <ns1:Status>1</ns1:Status>
            <ns1:Type>Protected</ns1:Type>
        </ns1:ProtectedProcess>
    </ns0:Processes>
</ns0:Processes>
</ns0:ServerData>


Answer 1:


You need to register the namespaces used in your XML, together with their prefixes, using the ET.register_namespace function. The registry is consulted when the tree is serialized, so the calls must run before tree.write. Example:

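import xml.etree.ElementTree as ET

# Map each namespace URI to the prefix used in input.xml.
# An empty prefix registers the default namespace (plain xmlns="...").
ET.register_namespace('', 'http://schemas.xxx/2004/07/Server.Facades.ImportExport')
ET.register_namespace('i', 'http://www.a.org')
ET.register_namespace('d3p1', 'http://schemas.datacontract.org/2004/07/Management.Interfaces')

# Round trip: the registered prefixes are used when writing.
tree = ET.parse("./input.xml")
tree.write("./output.xml")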

Without this registration, ElementTree generates its own prefixes (ns0, ns1, ...) for the namespaces it encounters, which is what happened in your case.

This is described in the documentation:
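If the prefixes are not known in advance, one option is to collect them from the file itself before parsing. The following is only a sketch under some assumptions: register_all_namespaces is a hypothetical helper name, the example.org URIs are placeholders standing in for the real ones, and the sample document is a shortened stand-in for input.xml. It relies on the "start-ns" event of ET.iterparse, which reports each (prefix, uri) declaration as the parser meets it.

```python
import io
import xml.etree.ElementTree as ET


def register_all_namespaces(source):
    """Hypothetical helper: register every prefix declared in `source`."""
    declared = []
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        try:
            ET.register_namespace(prefix, uri)
        except ValueError:
            # Prefixes of the form ns<number> are reserved by ElementTree.
            continue
        declared.append((prefix, uri))
    return declared


# Shortened stand-in for input.xml; the URIs are placeholders.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<ServerData xmlns:i="http://example.org/i" xmlns="http://example.org/server">
  <Processes xmlns:d3p1="http://example.org/mgmt">
    <d3p1:ProtectedProcess i:nil="true" />
  </Processes>
</ServerData>"""

declared = register_all_namespaces(io.BytesIO(SAMPLE))

# Parse and write back; the original prefixes are now reused.
tree = ET.parse(io.BytesIO(SAMPLE))
out = io.BytesIO()
# xml_declaration=True keeps the <?xml ...?> line, which write() omits by default.
tree.write(out, encoding="utf-8", xml_declaration=True)
result = out.getvalue().decode("utf-8")
print(result)
```

Even with the prefixes registered, the round trip is not byte-for-byte identical: as the output.xml in the question shows, ElementTree writes all namespace declarations on the root element and drops declarations for prefixes that are never used (d5p1 there). If one prefix is bound to different URIs in different parts of a document, this sketch keeps only the last binding, because the registry is global.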

xml.etree.ElementTree.register_namespace(prefix, uri)

Registers a namespace prefix. The registry is global, and any existing mapping for either the given prefix or the namespace URI will be removed. prefix is a namespace prefix. uri is a namespace uri. Tags and attributes in this namespace will be serialized with the given prefix, if at all possible.

(Emphasis mine)
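To illustrate the "registry is global" and "existing mapping will be removed" parts of that description, here is a small in-memory sketch. The namespace URI and the prefixes first and second are made up for the demonstration. It serializes the same parsed element three times, which also shows that registration takes effect at serialization time rather than at parse time.

```python
import xml.etree.ElementTree as ET

URI = "http://example.org/demo"  # hypothetical namespace URI
root = ET.fromstring('<a:item xmlns:a="%s" />' % URI)

# Nothing registered for URI yet: a generated prefix is used.
unregistered = ET.tostring(root, encoding="unicode")

# After registration, the same element serializes with the chosen prefix.
ET.register_namespace("first", URI)
with_first = ET.tostring(root, encoding="unicode")

# Registering the same URI again replaces the earlier mapping.
ET.register_namespace("second", URI)
with_second = ET.tostring(root, encoding="unicode")

print(unregistered)
print(with_first)
print(with_second)
```

Because the mapping lives at module level, it affects every tree serialized later in the same process, not only the one you had in mind when registering.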



Source: https://stackoverflow.com/questions/33258826/python-module-xml-etree-elementtree-modifies-xml-namespace-keys-automatically
