Cheap way to search a large text file for a string

隐瞒了意图╮ 2020-11-27 04:15

I need to search a pretty large text file for a particular string. It's a build log with about 5000 lines of text. What's the best way to go about doing that? Using regex sho

9 answers
  •  南方客 (original poster)
    2020-11-27 05:07

    This is entirely inspired by laurasia's answer above, but it refines the structure.

    It also adds some checks:

    • It will correctly return 0 when searching an empty file for the empty string. In laurasia's answer, this is an edge case that will return -1.
    • It also pre-checks whether the goal string is larger than the buffer size, and raises an error if this is the case.

    In practice, the goal string should be much smaller than the buffer for efficiency, and there are more efficient methods of searching if the size of the goal string is very close to the size of the buffer.

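    def fnd(fname, goal, start=0, bsize=4096):
        """Return the byte offset of the first match of goal at or after start, or -1."""
        # The file is opened in binary mode, so goal must be a bytes object.
        if bsize < len(goal):
            raise ValueError("The buffer size must be at least as large as the string being searched for.")
        with open(fname, 'rb') as f:
            if start > 0:
                f.seek(start)
            # Re-read this many bytes each round so a match that straddles
            # two buffers is still seen whole.
            overlap = len(goal) - 1
            while True:
                buffer = f.read(bsize)
                pos = buffer.find(goal)
                if pos >= 0:
                    return f.tell() - len(buffer) + pos
                if len(buffer) < bsize:
                    # A short read means end of file. Stopping here (rather than
                    # only on an empty read) keeps the loop from re-reading the
                    # final overlap bytes forever.
                    return -1
                f.seek(f.tell() - overlap)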
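
    To see why the overlap matters, here is a minimal sketch of the same chunk-and-overlap idea run against an in-memory stream. The helper name find_in_stream and the sample data are made up for illustration and are not part of the answer above. It assumes a seekable binary stream and a bytes goal.

```python
import io


def find_in_stream(stream, goal, bsize=4096):
    """Return the byte offset of the first occurrence of goal, or -1.

    Hypothetical helper: same chunk-and-overlap search, but it takes an
    already opened, seekable binary stream instead of a file name.
    """
    if bsize < len(goal):
        raise ValueError("bsize must be at least len(goal)")
    overlap = max(len(goal) - 1, 0)
    while True:
        chunk_start = stream.tell()
        chunk = stream.read(bsize)
        pos = chunk.find(goal)
        if pos >= 0:
            return chunk_start + pos
        if len(chunk) < bsize:
            # Short read: end of stream reached without a match.
            return -1
        # Step back so a match straddling the chunk boundary is seen whole.
        stream.seek(chunk_start + len(chunk) - overlap)


# With bsize=8 the first chunk ends in the middle of b"ERROR".
data = b"a" * 6 + b"ERROR" + b"b" * 6

print(find_in_stream(io.BytesIO(data), b"ERROR", bsize=8))  # 6
print(find_in_stream(io.BytesIO(data), b"WARN", bsize=8))   # -1
print(find_in_stream(io.BytesIO(b""), b""))                 # 0
```

    The first call only succeeds because the second read starts 4 bytes (len(goal) - 1) before the end of the first chunk. Without that step back, the two halves of the match would land in different chunks and neither read would contain it.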
    
