How do I parse and write XML using Python's ElementTree without moving namespaces around?

-上瘾入骨i 2020-12-03 12:48

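Our project receives XML of this form from upstream:

    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
      <runtime>
        <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
          <dependentAssembly>
            <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
            <bindingRedirect oldVersion="0.0.0.0-6.0.0.0" newVersion="7.0.0.0" />
          </dependentAssembly>
        </assemblyBinding>
      </runtime>
      <appSettings>
        <add key="foo" value="default" />
      </appSettings>
    </configuration>
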
1 answer
  • 2020-12-03 13:02

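    As far as I know, the approach that best suits your needs is to write a custom serializer in pure Python on top of xml.etree.ElementTree, so that each namespace declaration is written on the element that first uses it instead of being hoisted to the root. Here is one possible solution: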

    from xml.etree import ElementTree as ET
    from xml.sax.saxutils import escape, quoteattr
    
    def render(root, namespaces=None, declared=frozenset(), level=0, indent_size=2, encoding='utf-8'):
        # Note: namespaced attribute names and tail text are not handled here.
        element = root.getroot() if isinstance(root, ET.ElementTree) else root
        # Emit the XML declaration once, for the top-level call only.
        buffer = f'<?xml version="1.0" encoding="{encoding}" ?>\n' if not level else ''
        if namespaces is None:
            # ET._namespaces is a private helper; it maps each namespace URI used
            # in the tree to its prefix ('' for a registered default namespace).
            _, namespaces = ET._namespaces(element)
        indent = ' ' * indent_size * level
        tag = element.tag
        xmlns = ''
        if tag.startswith('{'):
            uri, tag = tag[1:].split('}', 1)
            prefix = namespaces[uri]
            if prefix:
                tag = f'{prefix}:{tag}'
            if uri not in declared:
                # Declare the namespace on the first element of this branch that uses it.
                xmlns = (f' xmlns:{prefix}' if prefix else ' xmlns') + f'={quoteattr(uri)}'
                declared = declared | {uri}
        buffer += f'{indent}<{tag}{xmlns}'
        for k, v in element.attrib.items():
            buffer += f' {k}={quoteattr(v)}'
        buffer += '>' + escape((element.text or '').strip())
        children = list(element)
        if children:
            buffer += '\n'
            # Render each child exactly once; the recursion handles deeper levels.
            for child in children:
                buffer += render(child, namespaces, declared, level + 1, indent_size)
            buffer += indent
        return buffer + f'</{tag}>\n'
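
The render function gets its URI-to-prefix mapping from ET._namespaces, which is a private helper and could change between Python versions. As a hedged alternative, the sketch below builds a mapping of the same shape from the source text using only the public iterparse API with "start-ns" events. The name collect_namespaces is hypothetical, and the input is assumed to be UTF-8 text.

```python
import io
from xml.etree import ElementTree as ET


def collect_namespaces(xml_text):
    """Return {uri: prefix} for every namespace declared in xml_text."""
    mapping = {}
    source = io.BytesIO(xml_text.encode('utf-8'))  # assumes UTF-8 input
    # Each "start-ns" event carries a (prefix, uri) tuple; the prefix is ''
    # for a default namespace declaration.
    for _event, (prefix, uri) in ET.iterparse(source, events=('start-ns',)):
        mapping.setdefault(uri, prefix)  # keep the first prefix seen per URI
    return mapping


doc = '<a><b xmlns="urn:x"><c xmlns:p="urn:y" /></b></a>'
print(collect_namespaces(doc))
```

Because the prefixes come from the document itself, this variant needs no ET.register_namespace call to recover a default namespace: it shows up in the mapping with an empty prefix.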
    

    Pass the XML data you gave to the render function above, as shown below:

    data=\
    '''<?xml version="1.0" encoding="utf-8"?>
    <configuration>
      <runtime>
        <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
          <dependentAssembly>
            <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
            <bindingRedirect oldVersion="0.0.0.0-6.0.0.0" newVersion="7.0.0.0" />
          </dependentAssembly>
        </assemblyBinding>
      </runtime>
      <appSettings>
        <add key="foo" value="default" />
      </appSettings>
    </configuration>'''
    
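    # Register an empty prefix so the namespace is written as xmlns="..." rather than xmlns:ns0="...".
    ET.register_namespace('', "urn:schemas-microsoft-com:asm.v1")
    e = ET.fromstring(data)
    r = ET.ElementTree(e)
    print(render(r))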
    

    You'll get the following XML, which has the properties you said you were looking for:

    <?xml version="1.0" encoding="utf-8" ?>
    <configuration>
      <runtime>
        <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
          <dependentAssembly>
            <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral"></assemblyIdentity>
            <bindingRedirect oldVersion="0.0.0.0-6.0.0.0" newVersion="7.0.0.0"></bindingRedirect>
          </dependentAssembly>
        </assemblyBinding>
      </runtime>
      <appSettings>
        <add key="foo" value="default"></add>
      </appSettings>
    </configuration>
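
For comparison, here is a minimal sketch of what the stock serializer does with the same kind of input, which is the behavior the custom render function works around. The trimmed sample document is an assumption made for brevity, and the snippet is meant to be run in a fresh interpreter in which no prefix has been registered for the namespace.

```python
from xml.etree import ElementTree as ET

# Trimmed-down sample: only <assemblyBinding> lives in the namespace.
sample = (
    '<configuration><runtime>'
    '<assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1" />'
    '</runtime></configuration>'
)

root = ET.fromstring(sample)
stock = ET.tostring(root, encoding='unicode')

# ElementTree collects every namespace used in the tree and declares all of
# them on the root element, using a generated prefix such as ns0 when no
# prefix has been registered for the URI.
print(stock)
```

The declaration ends up on configuration and assemblyBinding gains a prefix, so the document is namespace-equivalent but no longer textually close to what came from upstream.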
    

    I know I'm late to the party, but I hope this helps you and anyone else with the same issue. Happy coding!
