Question
I'm working with very large XML files (>1 GB) and need a way to write them incrementally.
There is a single top-level element and thousands of large second-level elements (each has its own multi-level hierarchy). I tried this:
    from lxml import etree

    with etree.xmlfile(out_file_name, encoding='UTF-8') as xf:
        xf.write_declaration()
        with xf.element('top'):
            xf.write('\n')
            # parse individual input files and write the 2nd level element to the output
            for file_name in file_list:
                context = etree.iterparse(file_name, tag='my_2nd_level_tag', remove_blank_text=True)
                for _, elem in context:
                    xf.write(elem, pretty_print=True)
The result is that each second-level element is written with the same (zero) indentation as the 'top' element, instead of being indented one level beneath it.
I'm looking for a clean way to use lxml's incremental XML writes to produce a fully indented XML.
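One possible direction, sketched here under assumptions rather than offered as a confirmed solution: re-indent each second-level element as if it sat at depth 1, then write the leading whitespace by hand before passing the element to the incremental writer. The sketch assumes lxml 4.5 or newer (for etree.indent and its level argument); write_indented and make_item are hypothetical names used only for illustration, and the in-memory elements stand in for those yielded by iterparse.

```python
import io
from lxml import etree


def write_indented(out, elements, top_tag="top", space="  "):
    """Stream `elements` under one root, indenting each as a depth-1 child.

    Hypothetical helper: assumes lxml >= 4.5 for etree.indent(level=...).
    """
    with etree.xmlfile(out, encoding="UTF-8") as xf:
        xf.write_declaration()
        with xf.element(top_tag):
            for elem in elements:
                # Rewrite whitespace inside the subtree so its children sit
                # at depth 2 and its closing tag lines up at depth 1.
                etree.indent(elem, space=space, level=1)
                # The writer does not indent the element itself, so emit the
                # newline and leading whitespace explicitly.
                xf.write("\n" + space)
                # Skip the tail so whitespace from the source file is not copied.
                xf.write(elem, with_tail=False)
                # Release the subtree's content once it has been serialized.
                elem.clear()
            # Put the closing root tag on its own line.
            xf.write("\n")


def make_item(n):
    """Build a small sample element (stand-in for an iterparse result)."""
    item = etree.Element("item")
    etree.SubElement(item, "a").text = str(n)
    return item


buf = io.BytesIO()
write_indented(buf, [make_item(1), make_item(2)])
result = buf.getvalue().decode("utf-8")
print(result)
```

With real input, the elements argument could be a generator that yields elem from etree.iterparse(file_name, tag='my_2nd_level_tag') for each file, so only one second-level subtree is held in memory at a time.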
Source: https://stackoverflow.com/questions/56000781/lxml-write-an-incremental-pretty-print-xml