Parsing a large .bz2 file (40 GB) with lxml iterparse in Python: error that does not appear with the uncompressed file
Question

I am trying to parse OpenStreetMap's planet.osm, compressed in bz2 format. Because it is already 41 GB, I don't want to decompress the file completely. So I figured out how to parse portions of the planet.osm file using bz2 and lxml, with the following code:

    from lxml import etree as et
    from bz2 import BZ2File

    path = "where/my/fileis.osm.bz2"
    with BZ2File(path) as xml_file:
        parser = et.iterparse(xml_file, events=('end',))
        for events, elem in parser:
            if elem.tag == "tag":
                continue
            if elem.tag ==