lxml: writing pretty-printed XML incrementally

狂风中的少年 submitted on 2020-06-16 05:18:51

Question


I'm working with very large XML files (>1GB) and require a method to write them incrementally.

There's a single top-level element and thousands of large 2nd level elements (each has its own multi-level hierarchy). I tried this:

from lxml import etree

# out_file_name and file_list are defined elsewhere
with etree.xmlfile(out_file_name, encoding='UTF-8') as xf:
    xf.write_declaration()

    with xf.element('top'):
        xf.write('\n')

        # parse individual input files and write the 2nd level element to the output
        for file_name in file_list:
            context = etree.iterparse(file_name, tag='my_2nd_level_tag', remove_blank_text=True)
            for _, elem in context:
                xf.write(elem, pretty_print=True)

The result is that the 2nd level elements are written with the same (zero) indentation as the 'top' element, instead of being nested one level deeper.

I'm looking for a clean way to use lxml's incremental XML writes to produce fully indented XML.
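
One possible approach, offered as a sketch rather than an answer taken from the original thread: indent each 2nd level element in place with etree.indent() (available in lxml 4.5 and later, as far as I know) starting at level 1, then write the leading indentation and a trailing newline by hand. The helper name write_indented, its parameters, and the sample 'item' elements are hypothetical and exist only for illustration.

```python
import io
from lxml import etree


def write_indented(out, elements, top_tag="top", indent="  "):
    """Write `elements` under a single top-level tag, indented one level.

    `out` may be a file name or a binary file-like object.
    Assumes lxml >= 4.5 for etree.indent().
    """
    with etree.xmlfile(out, encoding="UTF-8") as xf:
        xf.write_declaration()
        with xf.element(top_tag):
            xf.write("\n")
            for elem in elements:
                # Re-indent the subtree as if it were nested one level deep.
                etree.indent(elem, space=indent, level=1)
                # indent() leaves the element's own tail alone, so set it here.
                elem.tail = "\n"
                # Leading whitespace for the element's opening tag.
                xf.write(indent)
                xf.write(elem)


# Small in-memory demonstration with made-up elements.
demo = io.BytesIO()
sample = [etree.fromstring("<item><name>a</name></item>")]
write_indented(demo, sample)
print(demo.getvalue().decode("utf-8"))
```

In the original loop, the elements yielded by etree.iterparse could be passed through the same steps instead of a list. With files this large it is probably also worth calling elem.clear() after each write so that already written subtrees do not accumulate in memory.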

Source: https://stackoverflow.com/questions/56000781/lxml-write-an-incremental-pretty-print-xml
