I am writing code to take an enormous text file (several GB) N lines at a time, process that batch, and move on to the next N lines until I have completed the entire file.
I used the chunker function from What is the most “pythonic” way to iterate over a list in chunks?:
    from itertools import izip_longest

    def grouper(iterable, n, fillvalue=None):
        "grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx"
        args = [iter(iterable)] * n
        return izip_longest(*args, fillvalue=fillvalue)
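
For reference, the padding behavior looks like this (a small example with placeholder data):

    >>> list(grouper('ABCDEFG', 3, 'x'))
    [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]

I then apply it to the file: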
    with open(filename) as f:
        for lines in grouper(f, chunk_size, ""):  # for every chunk_size-sized chunk
            """process lines like
            lines[0], lines[1], ..., lines[chunk_size-1]"""
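
If the fill padding on the final chunk gets in the way, an islice-based variant sidesteps it by yielding a shorter last chunk instead of padding. This is just a minimal sketch; read_in_chunks is a hypothetical name, and filename / chunk_size are assumed to be defined as above:

    from itertools import islice

    def read_in_chunks(f, chunk_size):
        """Yield lists of up to chunk_size lines from the file object f."""
        while True:
            lines = list(islice(f, chunk_size))  # read at most chunk_size lines
            if not lines:                        # an empty list means EOF
                break
            yield lines

    with open(filename) as f:
        for lines in read_in_chunks(f, chunk_size):
            """process lines; the last chunk may hold fewer
            than chunk_size lines, with no fill values"""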