I am reading an 800 GB XML file in Python 2.7 and parsing it with an etree iterative parser. Currently I am just opening it with open('foo.txt') and no explicit buffering argument.
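Roughly, the code looks like this (handle() stands in for my real per-element processing):

```python
import xml.etree.cElementTree as etree

f = open('foo.txt')  # default buffering, no buffer size passed
for event, elem in etree.iterparse(f):
    handle(elem)   # placeholder for the real per-element processing
    elem.clear()   # drop finished elements so memory stays bounded
f.close()
```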
Have you tried a lazy function? The question Lazy Method for Reading Big File in Python? seems to already answer yours: read the file in fixed-size chunks from a generator, along the lines sketched below.
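A minimal sketch of that pattern (the 1 MB chunk size and process_chunk() are placeholders, not fixed choices):

```python
def read_in_chunks(file_object, chunk_size=1024 * 1024):
    """Lazily yield fixed-size chunks (1 MB here) from a file object."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

with open('foo.txt', 'rb') as f:
    for chunk in read_in_chunks(f):
        process_chunk(chunk)  # placeholder for whatever you do per chunk
```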
However, I would consider using this method to write your data out to a database instead. MySQL is free (http://dev.mysql.com/downloads/), and NoSQL is also free and might be a little better tailored to writing 800 GB of data, or similarly large amounts: http://www.oracle.com/technetwork/database/nosqldb/downloads/default-495311.html
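As a rough illustration of the database route (the MySQLdb driver, the connection details, the table layout, and the parse_records() generator are all assumptions made up for the example):

```python
import MySQLdb  # the usual MySQL driver for Python 2.7

# Connection details and table layout below are placeholders for the example.
conn = MySQLdb.connect(host='localhost', user='user', passwd='passwd', db='xmldump')
cur = conn.cursor()

for record in parse_records('foo.txt'):  # placeholder: your iterparse-based generator
    cur.execute("INSERT INTO records (name, value) VALUES (%s, %s)",
                (record['name'], record['value']))

conn.commit()
conn.close()
```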