Editing XML as a dictionary in python?

前端 未结 8 1174
深忆病人
深忆病人 2020-12-31 18:11

I\'m trying to generate customized xml files from a template xml file in python.

Conceptually, I want to read in the template xml, remove some elements, change some

8条回答
  •  一生所求
    2020-12-31 18:31

    My modification of Daniel's answer, to give a marginally neater dictionary:

    def xml_to_dictionary(element):
        l = len(namespace)
        dictionary={}
        tag = element.tag[l:]
        if element.text:
            if (element.text == ' '):
                dictionary[tag] = {}
            else:
                dictionary[tag] = element.text
        children = element.getchildren()
        if children:
            subdictionary = {}
            for child in children:
                for k,v in xml_to_dictionary(child).items():
                    if k in subdictionary:
                        if ( isinstance(subdictionary[k], list)):
                            subdictionary[k].append(v)
                        else:
                            subdictionary[k] = [subdictionary[k], v]
                    else:
                        subdictionary[k] = v
            if (dictionary[tag] == {}):
                dictionary[tag] = subdictionary
            else:
                dictionary[tag] = [dictionary[tag], subdictionary]
        if element.attrib:
            attribs = {}
            for k,v in element.attrib.items():
                attribs[k] = v
            if (dictionary[tag] == {}):
                dictionary[tag] = attribs
            else:
                dictionary[tag] = [dictionary[tag], attribs]
        return dictionary
    

    namespace is the xmlns string, including braces, that ElementTree prepends to all tags, so here I've cleared it as there is one namespace for the entire document

    NB that I adjusted the raw xml too, so that 'empty' tags would produce at most a ' ' text property in the ElementTree representation

    spacepattern = re.compile(r'\s+')
    mydictionary = xml_to_dictionary(ElementTree.XML(spacepattern.sub(' ', content)))
    

    would give for instance

    {'note': {'to': 'Tove',
             'from': 'Jani',
             'heading': 'Reminder',
             'body': "Don't forget me this weekend!"}}
    

    it's designed for specific xml that is basically equivalent to json, should handle element attributes such as

    elementContent
    

    too

    there's the possibility of merging the attribute dictionary / subtag dictionary similarly to how repeat subtags are merged, although nesting the lists seems kind of appropriate :-)

提交回复
热议问题