I am trying to parse a large fasta file and I am encountering out of memory errors. Some suggestions to improve the data handling would be appreciated. Currently the program
def read_fasta(filename):
name = None
with open(filename) as file:
for line in file:
if line[0] == ">":
if name:
yield (name, seq)
name = line[1:-1].split("|")[0]
seq = ""
else:
seq += line[:-1]
yield (name, seq)