问题
I've noticed that python ElementTree module, changes the xml data in the following simple example :
import xml.etree.ElementTree as ET
tree = ET.parse("./input.xml")
tree.write("./output.xml")
I wouldn't expect it to change, as I've done simple read and write test without any modification. however, the results shows a different story, especially in the namespace indices (nonage --> ns0 , d3p1 --> ns1 , i --> ns2 ) :
input.xml:
<?xml version="1.0" encoding="utf-8"?>
<ServerData xmlns:i="http://www.a.org" xmlns="http://schemas.xxx/2004/07/Server.Facades.ImportExport">
<CreationDate>0001-01-01T00:00:00</CreationDate>
<Processes>
<Processes xmlns:d3p1="http://schemas.datacontract.org/2004/07/Management.Interfaces">
<d3p1:ProtectedProcess>
<d3p1:Description>/Applications/Safari.app/Contents/MacOS/Safari</d3p1:Description>
<d3p1:DiscoveredMachine i:nil="true" />
<d3p1:Id>0</d3p1:Id>
<d3p1:Name>/applications/safari.app/contents/macos/safari</d3p1:Name>
<d3p1:Path>/Applications/Safari.app/Contents/MacOS/Safari</d3p1:Path>
<d3p1:ProcessHashes xmlns:d5p1="http://schemas.datacontract.org/2004/07/Management.Interfaces.WildFire" />
<d3p1:Status>1</d3p1:Status>
<d3p1:Type>Protected</d3p1:Type>
</d3p1:ProtectedProcess>
</Processes>
</Processes>
and output.xml:
<ns0:ServerData xmlns:ns0="http://schemas.xxx/2004/07/Server.Facades.ImportExport" xmlns:ns1="http://schemas.datacontract.org/2004/07/Management.Interfaces" xmlns:ns2="http://www.a.org">
<ns0:CreationDate>0001-01-01T00:00:00</ns0:CreationDate>
<ns0:Processes>
<ns0:Processes>
<ns1:ProtectedProcess>
<ns1:Description>/Applications/Safari.app/Contents/MacOS/Safari</ns1:Description>
<ns1:DiscoveredMachine ns2:nil="true" />
<ns1:Id>0</ns1:Id>
<ns1:Name>/applications/safari.app/contents/macos/safari</ns1:Name>
<ns1:Path>/Applications/Safari.app/Contents/MacOS/Safari</ns1:Path>
<ns1:ProcessHashes />
<ns1:Status>1</ns1:Status>
<ns1:Type>Protected</ns1:Type>
</ns1:ProtectedProcess>
</ns0:Processes>
</ns0:Processes>
回答1:
You would need to register the namespaces for your xml as well as their prefixes with ElementTree
before reading/writing the xml using ElementTree.register_namespace function. Example -
import xml.etree.ElementTree as ET
ET.register_namespace('','http://schemas.xxx/2004/07/Server.Facades.ImportExport')
ET.register_namespace('i','http://www.a.org')
ET.register_namespace('d3p1','http://schemas.datacontract.org/2004/07/Management.Interfaces')
tree = ET.parse("./input.xml")
tree.write("./output.xml")
Without this ElementTree creates its own prefixes for the corresponding namespaces, which is what happens for your case.
This is given in the documentation -
xml.etree.ElementTree.register_namespace(prefix, uri)
Registers a namespace prefix. The registry is global, and any existing mapping for either the given prefix or the namespace URI will be removed. prefix is a namespace prefix. uri is a namespace uri. Tags and attributes in this namespace will be serialized with the given prefix, if at all possible.
(Emphasis mine)
来源:https://stackoverflow.com/questions/33258826/python-module-xml-etree-elementtree-modifies-xml-namespace-keys-automatically