Merging XML files using Python's ElementTree

暖寄归人 2020-12-09 12:41

I need to merge two XML files at the third block of the XML. The files A.xml and B.xml look like this:

A.xml




        
2 answers
  • 2020-12-09 13:13

    Although this is mostly a duplicate and the answer can be found here, I had already written this, so I can share the Python code:

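    import glob
    import os
    from xml.etree import ElementTree
    
    def run(files):
        # Collect every .xml file in the directory; sorting keeps the merge order stable
        xml_files = sorted(glob.glob(os.path.join(files, "*.xml")))
        xml_element_tree = None
        insertion_point = None
        for xml_file in xml_files:
            data = ElementTree.parse(xml_file).getroot()
            for result in data.iter('results'):
                if xml_element_tree is None:
                    # The first file becomes the container for the merged output
                    xml_element_tree = data
                    insertion_point = xml_element_tree.findall("./results")[0]
                else:
                    # Append the children of this <results> element to the container
                    insertion_point.extend(result)
        if xml_element_tree is not None:
            # print(...) with .decode() works on both Python 2.7 and Python 3
            print(ElementTree.tostring(xml_element_tree).decode())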
    

    However, this question contains another problem not present in the other post. The sample XML files are not valid XML, because a tag cannot be written with a bare value like this:

    <sample="1">
        ...
    </sample>
    

    The value needs an attribute name, so change it to something like:

    <sample id="1">
        ...
    </sample>
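
The difference between the two forms can be checked with the parser itself. The following is a minimal sketch, assuming Python 3; the helper name `is_well_formed` and the two sample strings are made up for illustration and are not part of the original files.

```python
from xml.etree import ElementTree

def is_well_formed(xml_text):
    """Return True if xml_text parses, False if the parser raises ParseError."""
    try:
        ElementTree.fromstring(xml_text)
        return True
    except ElementTree.ParseError:
        return False

# Hypothetical minimal documents, reduced to the tag in question
invalid = '<sample="1"><results/></sample>'
valid = '<sample id="1"><results/></sample>'

print(is_well_formed(invalid))  # the bare value is rejected by the parser
print(is_well_formed(valid))    # the named attribute parses
```

Running a check like this on each input file before merging makes it easier to see which file the parser rejects.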
    
  • 2020-12-09 13:13

    You could try this solution:

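    import glob
    import os
    from xml.etree import ElementTree
    
    def newRunRun(folder):
        # Sorting keeps the choice of container document stable
        xml_files = sorted(glob.glob(os.path.join(folder, "*.xml")))
        node = None
        for xmlFile in xml_files:
            root = ElementTree.parse(xmlFile).getroot()
            if node is None:
                # The first document becomes the container
                node = root
            else:
                elements = root.find("./results")
                if elements is None:
                    continue  # nothing to merge from this file
                # Iterate the element itself rather than the private _children attribute
                for element in elements:
                    # node[1] assumes <results> is the second child of the root
                    node[1].append(element)
        if node is not None:
            # print(...) with .decode() works on both Python 2.7 and Python 3
            print(ElementTree.tostring(node).decode())
    
    folder = "resources"
    newRunRun(folder)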
    

    As you can see, I'm using the first document as a container and inserting the elements of the other documents into it. This is the output generated:

    <sample id="1">
    <workflow value="x" version="1" />
      <results>
       <result type="Q">
          <result_data type="value" value="11" />
          <result_data type="value" value="21" />
          <result_data type="value" value="13" />
          <result_data type="value" value="12" />
          <result_data type="value" value="15" />
        </result>
      <result type="T">
          <result_data type="value" value="19" />
          <result_data type="value" value="15" />
          <result_data type="value" value="14" />
          <result_data type="value" value="13" />
          <result_data type="value" value="12" />
        </result>
      </results>
    </sample>
    

    Using the version: Python 2.7.15
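
For readers on Python 3, the same container approach can be sketched without touching the file system. This is only an illustration under stated assumptions: the contents of A.xml and B.xml are not shown in the question, so `DOC_A` and `DOC_B` below are hypothetical inputs modeled on the merged output above, and `merge_results` is a made-up helper name.

```python
from xml.etree import ElementTree

# Hypothetical inputs: assumed shapes, shortened to one value per result
DOC_A = '''<sample id="1">
  <workflow value="x" version="1" />
  <results>
    <result type="Q"><result_data type="value" value="11" /></result>
  </results>
</sample>'''

DOC_B = '''<sample id="1">
  <workflow value="x" version="1" />
  <results>
    <result type="T"><result_data type="value" value="19" /></result>
  </results>
</sample>'''

def merge_results(xml_texts):
    """Use the first document that has a <results> child as the container
    and append the <result> children of every later document to it."""
    container = None
    target = None
    for text in xml_texts:
        root = ElementTree.fromstring(text)
        results = root.find("results")
        if results is None:
            continue  # skip documents without a <results> block
        if container is None:
            container, target = root, results
        else:
            target.extend(list(results))
    return container

merged = merge_results([DOC_A, DOC_B])
# encoding="unicode" makes tostring() return str instead of bytes (Python 3)
print(ElementTree.tostring(merged, encoding="unicode"))
```

Looking up `<results>` by name instead of by position keeps the sketch working even if the order of the root's children changes.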
