How to grep lines between two patterns in a big file with Python

青春惊慌失措 2021-01-03 12:45

I have a very big file, like this:

[PATTERN1]
line1
line2
line3 
...
...
[END PATTERN]
[PATTERN2]
line1 
line2
...
...
[END PATTERN]

I need to extract

5 answers
  •  既然无缘
    2021-01-03 13:09

    Use something like

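    import re

    # Placeholder patterns; adjust them to match your real marker lines.
    START_PATTERN = r'^START-PATTERN$'
    END_PATTERN = r'^END-PATTERN$'

    # Open the output once so a later block does not overwrite an earlier one.
    with open('myfile') as file, open('my_new_file.txt', 'w') as newfile:
        match = False  # True while we are between a start and an end line

        for line in file:
            if re.match(START_PATTERN, line):
                match = True
            elif re.match(END_PATTERN, line):
                match = False
            elif match:
                # Each line already ends with '\n', so write it unchanged.
                newfile.write(line)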
    

    This iterates over the file line by line without reading it all into memory. It also writes each matching line directly to the new file rather than appending to a list in memory; if your source is large enough, such a list could itself become a problem.

    You will probably need to adapt this code. For instance, if a regex is not required to match a start/end line, replace the re.match call with a plain test such as if 'xyz' in line.
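
    As a rough sketch of the non-regex variant, the generator below compares each stripped line against literal markers. The function name iter_block and the sample data are made up for illustration, and it uses exact equality rather than a substring test, which assumes each marker sits alone on its line as in the question's example.

```python
def iter_block(lines, start_marker, end_marker):
    """Yield the lines strictly between start_marker and end_marker.

    Markers are compared with each line after stripping surrounding
    whitespace; no regular expression is involved.
    """
    inside = False
    for line in lines:
        stripped = line.strip()
        if stripped == start_marker:
            inside = True
        elif stripped == end_marker:
            inside = False
        elif inside:
            yield line


# Hypothetical sample mirroring the layout shown in the question.
sample = [
    "[PATTERN1]\n",
    "line1\n",
    "line2\n",
    "[END PATTERN]\n",
    "[PATTERN2]\n",
    "other\n",
    "[END PATTERN]\n",
]
block = list(iter_block(sample, "[PATTERN1]", "[END PATTERN]"))
```

    Because the function accepts any iterable of lines, an open file object can be passed in place of the list, so the file is still read one line at a time.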
