Keep Existing Namespaces when overwriting XML file with ElementTree and Python

↘锁芯ラ submitted on 2019-11-28 11:30:32

As far as I know, xml.etree.ElementTree offers no method that achieves your goal. After digging through the xml.etree source code and the XML specification, I found that the library's behaviour is neither wrong nor unreasonable; it simply does not allow the output you are looking for.
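To make that behaviour concrete, here is a small sketch run on a shortened sample document (the names URI, sample, generated and hoisted are only illustrative). Without a registered prefix, ElementTree invents one and declares it on the root element. With an empty prefix registered, it still moves the declaration to the root, which places every element in that namespace.

```python
from xml.etree import ElementTree as ET

URI = 'urn:schemas-microsoft-com:asm.v1'
sample = f'<foo><a><b xmlns="{URI}"><c>1</c></b></a></foo>'
root = ET.fromstring(sample)

# No prefix registered: ElementTree generates a prefix ("ns0") and
# declares it on the root element instead of on <b>.
generated = ET.tostring(root, encoding='unicode')
print(generated)

# Empty prefix registered: the xmlns="..." declaration is still written on
# the root, so <foo> and <a> now appear to belong to the namespace as well.
ET.register_namespace('', URI)
hoisted = ET.tostring(root, encoding='unicode')
print(hoisted)
```

In both cases the declaration leaves the <b> element where it was originally written, which is why a custom renderer is needed.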

To get that output with this library, you have to customize the rendering yourself. To suit your needs, I have written the following render function.

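from re import findall, sub
from xml.etree import ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def render(root, buffer='', namespaces=None, level=0, indent_size=2, encoding='utf-8'):
    # Limitations: only default (unprefixed) namespaces are rendered correctly,
    # so register the URI with an empty prefix first. Namespaced attributes and
    # tail text are not handled.
    # Emit the XML declaration once, at the top level only.
    buffer += f'<?xml version="1.0" encoding="{encoding}" ?>\n' if not level else ''
    element = root.getroot() if isinstance(root, ET.ElementTree) else root
    if not level:
        # ET._namespaces() is a private helper; it returns (qnames, {uri: prefix}).
        _, namespaces = ET._namespaces(element)
    indent = ' ' * indent_size * level
    # ElementTree stores qualified tags as '{uri}local'; keep only the local part.
    tag = sub(r'({[^}]+}\s*)*', '', element.tag)
    buffer += f'{indent}<{tag}'
    for ns in findall(r'{[^}]+}', element.tag):
        ns_key = ns[1:-1]
        if ns_key not in namespaces:
            continue  # already declared on an ancestor
        prefix = namespaces[ns_key]
        buffer += ' xmlns' + (f':{prefix}' if prefix else '') + f'="{ns_key}"'
        # Drop the URI for this subtree only, so that sibling subtrees using
        # the same namespace still get their own declaration.
        namespaces = {k: v for k, v in namespaces.items() if k != ns_key}
    for k, v in element.attrib.items():
        buffer += f' {k}={quoteattr(v)}'
    buffer += '>' + escape(element.text.strip()) if element.text else '>'
    children = list(element)
    # Render each child exactly once. Looping over root.iter() here would also
    # visit every descendant and duplicate the nested elements in the output.
    for child in children:
        sep = '\n' if buffer[-1] != '\n' else ''
        buffer += sep + render(child, level=level+1, indent_size=indent_size, namespaces=namespaces)
    buffer += f'{indent}</{tag}>\n' if children else f'</{tag}>\n'
    return buffer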
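To show what the two regular expressions in render() operate on, here is a minimal sketch of how ElementTree stores a namespaced tag and what ET._namespaces returns. Note that _namespaces is a private helper of CPython's implementation, so relying on it is an assumption that may not hold in other versions; the variable names are only illustrative.

```python
from re import findall, sub
from xml.etree import ElementTree as ET

URI = 'urn:schemas-microsoft-com:asm.v1'
elem = ET.fromstring(f'<b xmlns="{URI}"><c>1</c></b>')

# ElementTree stores a qualified name as "{uri}local".
qualified_tag = elem.tag

# Same patterns as in render(): strip the "{uri}" part, or extract it.
local_name = sub(r'({[^}]+}\s*)*', '', qualified_tag)
uris = [ns[1:-1] for ns in findall(r'{[^}]+}', qualified_tag)]

# Private helper (implementation detail): returns (qnames, namespaces),
# where namespaces maps every URI used in the tree to a prefix.
_, namespaces = ET._namespaces(elem)

print(qualified_tag, local_name, uris, namespaces)
```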

Pass your XML input data to the render() function above as follows:

data = '''<?xml version="1.0" encoding="utf-8"?>
<foo>
   <bar>
      <bat>1</bat>
   </bar>
   <a>
      <b xmlns="urn:schemas-microsoft-com:asm.v1">
         <c>1</c>
      </b>
   </a>
</foo>'''

root = ET.ElementTree(ET.fromstring(data))
# Map the URI to the empty prefix so that render() emits a plain xmlns="..." declaration.
ET.register_namespace('', "urn:schemas-microsoft-com:asm.v1")
print(render(root))

It prints the output you are looking for:

<?xml version="1.0" encoding="utf-8" ?>
<foo>
  <bar>
    <bat>1</bat>
  </bar>
  <a>
    <b xmlns="urn:schemas-microsoft-com:asm.v1">
      <c>1</c>
    </b>
  </a>
</foo>
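A quick way to check that no namespace information was lost is to parse both documents again and compare their qualified tags. This is only a sanity-check sketch: qualified_tags is a hypothetical helper name, and the rendered text is pasted in as a literal rather than produced by calling render().

```python
from xml.etree import ElementTree as ET

original = '''<?xml version="1.0" encoding="utf-8"?>
<foo>
   <bar>
      <bat>1</bat>
   </bar>
   <a>
      <b xmlns="urn:schemas-microsoft-com:asm.v1">
         <c>1</c>
      </b>
   </a>
</foo>'''

rendered = '''<?xml version="1.0" encoding="utf-8" ?>
<foo>
  <bar>
    <bat>1</bat>
  </bar>
  <a>
    <b xmlns="urn:schemas-microsoft-com:asm.v1">
      <c>1</c>
    </b>
  </a>
</foo>
'''

def qualified_tags(xml_text):
    """Return every tag in document order, in "{uri}local" form."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

# True when both documents put the same elements in the same namespaces.
same_namespaces = qualified_tags(original) == qualified_tags(rendered)
print(same_namespaces)
```

Comparing parsed tags rather than raw text ignores the indentation differences and only looks at which namespace each element ends up in.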