How to stream XML output quickly from Python

感情迁移 提交于 2019-12-07 04:56:43

问题


What is a quick way of writing an XML file iteratively (i.e. without having the whole document in memory)? xml.sax.saxutils.XMLGenerator works but is slow, around 1MB/s on an I7 machine. Here is a test case.


回答1:


I realize that this question has been asked awhile ago, but, in the mean time, an lxml API has been introduced that looks promising in terms of addressing the problem: http://lxml.de/api.html ; specifically, refer to the following section: "Incremental XML generation".

I quickly tested it by streaming a 10M file just as in your benchmark, and it took a fraction of a second on my old laptop, which is by no means very scientific, but is quite in the same ballpark as your generate_large_xml() function.




回答2:


As Yury V. Zaytsev mentioned, lxml realy provides API for generating XML documents in streaming manner

Here is working example:

from lxml import etree

fname = "streamed.xml"
with open(fname, "w") as f, etree.xmlfile(f) as xf:
    attribs = {"tag": "bagggg", "text": "att text", "published": "now"}
    with xf.element("root", attribs):
        xf.write("root text\n")
        for i in xrange(10):
            rec = etree.Element("record", id=str(i))
            rec.text = "record text data"
            xf.write(rec)

Resulting XML looks like this (the content reformatted from one-line XML doc):

<?xml version="1.0"?>
<root text="att text" tag="bagggg" published="now">root text
    <record id="0">record text data</record>
    <record id="1">record text data</record>
    <record id="2">record text data</record>
    <record id="3">record text data</record>
    <record id="4">record text data</record>
    <record id="5">record text data</record>
    <record id="6">record text data</record>
    <record id="7">record text data</record>
    <record id="8">record text data</record>
    <record id="9">record text data</record>
</root>


来源:https://stackoverflow.com/questions/19502503/how-to-stream-xml-output-quickly-from-python

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!