Merging Lots of XML files

北城以北 提交于 2019-12-02 07:27:41

Consider using XSLT and its document() function to merge XML files. Python (like many object-oriented programming languages) maintain an XSLT processor like in its lxml module. As information, XSLT is a declarative programming language to transform XML files in various formats and structures.

For your purposes, XSLT may be more efficient than using programming code to develop files as no lists or loops or other objects are held in memory during processing except what the XSLT processor would use.

XSLT (to be saved externally as .xsl file)

Consider initially running a Python write to text file looping to fill in all 365 documents to avoid copy and paste. Also notice first document is skipped since it is the starting point used in Python script below:

<?xml version="1.0" ?> 
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"> 

 <xsl:template match="DATA">
  <DATA>
    <xsl:copy> 
       <xsl:copy-of select="TALLYMESSAGE"/>
       <xsl:copy-of select="document('Document2.xml')/BODY/DATA/TALLYMESSAGE"/>
       <xsl:copy-of select="document('Document3.xml')/BODY/DATA/TALLYMESSAGE"/>
       <xsl:copy-of select="document('Document4.xml')/BODY/DATA/TALLYMESSAGE"/>
       ...
       <xsl:copy-of select="document('Document365.xml')/BODY/DATA/TALLYMESSAGE"/>             
    </xsl:copy>
  </DATA>
 </xsl:template> 

</xsl:transform>

Python (to be included in you overall script)

import lxml.etree as ET

dom = ET.parse('C:\Path\To\XML\Document1.xml')
xslt = ET.parse('C:\Path\To\XSL\file.xsl')
transform = ET.XSLT(xslt)
newdom = transform(dom)

tree_out = ET.tostring(newdom, encoding='UTF-8', pretty_print=True,  xml_declaration=True)
print(tree_out)

xmlfile = open('C:\Path\To\XML\OutputFile.xml','wb')
xmlfile.write(tree_out)
xmlfile.close()
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!