Have:
f = open(...)
r = re.compile(...)
Need:
Find the position (start and end) of the first regexp match in a big file.
(star
The following code works reasonably well with test files around 2GB in size.
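def search_file(pattern, filename, offset=0):
    # Binary mode keeps offsets in bytes, so pattern must be a bytes
    # pattern, e.g. re.compile(rb"...").
    with open(filename, "rb") as f:
        f.seek(offset)
        # f.tell() is unreliable while iterating over the file,
        # so track the start of the current line by hand.
        line_start = offset
        for line in f:
            m = pattern.search(line)
            if m:
                return line_start + m.start(), line_start + m.end()
            line_start += len(line)
    return None  # no match found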
Note that the regular expression must not span multiple lines, because each line is searched on its own.
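If the pattern does need to cross line boundaries, one possible alternative is to memory-map the file and search the mapping directly. The sketch below is only an illustration under a few assumptions: `find_first_match` is a hypothetical name, the pattern is compiled from bytes (for example `re.compile(rb"...")`), and the interpreter is a 64-bit build so that a file of a couple of gigabytes fits in the address space. The returned positions are byte offsets from the start of the file.

```python
import mmap
import os


def find_first_match(pattern, filename, offset=0):
    """Return (start, end) byte offsets of the first match at or after
    `offset`, or None if there is no match.

    `pattern` is assumed to be a compiled bytes pattern.
    """
    # An empty file cannot be memory-mapped, so handle it up front.
    if os.path.getsize(filename) == 0:
        return None
    with open(filename, "rb") as f:
        # Map the whole file read-only; the OS pages data in on demand,
        # so the file is not read into memory all at once.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Searching from `offset` still yields positions relative to
            # the start of the file, so no adjustment is needed.
            m = pattern.search(mm, offset)
            if m is None:
                return None
            return m.start(), m.end()
```

Because the whole mapping is searched as one buffer, a pattern such as `rb"bar\nbaz"` can match across a line break. The trade-off is that a greedy pattern like `rb".*"` with `re.DOTALL` may scan the entire file.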