How can I iterate over blocks of lines separated by an empty line? The file looks like this:
ID: 1
Name: X
FamilyN: Y
Age: 20

ID: 2
Name: H
FamilyN: F
A
If your file is too large to read into memory all at once, you can still use a regular-expression-based solution by memory-mapping the file with the mmap module:
import mmap
import re
import sys

# Bytes pattern: in Python 3 an mmap object exposes bytes, not str.
block_expr = re.compile(rb'ID:.*?\nAge: \d+', re.DOTALL)
filepath = sys.argv[1]
# Open in binary mode and map the whole file read-only (length 0 = entire file).
with open(filepath, 'rb') as fp:
    with mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ) as contents:
        for block_match in block_expr.finditer(contents):
            print(block_match.group().decode())
The mmap object acts as a "pretend string", so the regular expression can run over the file without reading it all into one large string. The finditer() method of the compiled pattern yields matches one at a time instead of building a list of all matches at once, as findall() does.
I do think this solution is overkill for this use case, however (still, it's a nice trick to know).
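For comparison, here is a minimal sketch of a simpler approach that needs neither regular expressions nor mmap. It assumes the blocks really are separated by blank lines, as the question describes, and that each line has the form "Key: value". The names iter_blocks and parse_block are made up for this illustration.

```python
from itertools import groupby


def iter_blocks(lines):
    """Yield each run of consecutive non-blank lines as a list of lines
    with their trailing newlines removed."""
    # groupby splits the input into alternating runs of blank and non-blank lines.
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if not is_blank:
            yield [line.rstrip('\n') for line in group]


def parse_block(block):
    """Turn ['ID: 1', 'Name: X'] into {'ID': '1', 'Name': 'X'}.
    Lines without a colon are skipped (an assumption about the format)."""
    record = {}
    for line in block:
        key, sep, value = line.partition(':')
        if sep:
            record[key.strip()] = value.strip()
    return record
```

Passing an open file object as lines (for example, iter_blocks(fp) inside a with open(...) block) reads the file line by line, so only one block is held in memory at a time.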