How do I read a random line from one file?

灰色年华 2020-12-04 20:03

Is there a built-in method to do it? If not how can I do this without costing too much overhead?

11 answers
  •  轻奢々 (OP)
    2020-12-04 20:07

    If you don't want to load the whole file into RAM with f.read() or f.readlines(), you can get a random line this way:

    import os
    import random


    def get_random_line(filepath: str) -> str:
        file_size = os.path.getsize(filepath)
        with open(filepath, 'rb') as f:
            while True:
                pos = random.randint(0, file_size)
                # Always seek first: after an EOF retry the file pointer sits
                # at the end, so pos == 0 must explicitly rewind to the start.
                f.seek(pos)
                if pos:  # not at the start: skip the possibly incomplete line
                    f.readline()
                line = f.readline()  # read the next (full) line
                if line or not file_size:  # an empty file yields ''
                    return line.decode()  # return str
                # else: line is empty -> EOF -> try another position in next iteration
    

    P.S.: Yes, this approach was proposed by Ignacio Vazquez-Abrams in his answer above, but (a) his answer contains no code, and (b) I came up with this implementation myself. It can return the first or the last line. I hope it is useful to someone.

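    However, if you care about the distribution, this code is not an option for you: a line's chance of being picked is proportional to the byte length of the line before it (and the first line is picked only when pos is 0), so the result is not uniform over lines.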
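
    If a uniform pick matters more than speed, one common alternative is reservoir sampling with a reservoir of size one. The sketch below is not part of the approach above and the name uniform_random_line is made up for illustration. It assumes you can afford to read the file once from start to end; it keeps only one line in memory at a time.

```python
import random
from typing import Optional


def uniform_random_line(filepath: str) -> Optional[str]:
    """Return one line chosen uniformly at random, or None for an empty file.

    Hypothetical helper: single pass over the file, constant extra memory.
    """
    chosen = None
    with open(filepath, 'rb') as f:
        for count, line in enumerate(f, start=1):
            # Replace the current pick with the count-th line with
            # probability 1/count; every line ends up equally likely.
            if random.randrange(count) == 0:
                chosen = line
    return chosen.decode() if chosen is not None else None
```

    The trade-off is that this reads every byte of the file on each call, whereas the seek-based version touches only a few lines, so it probably suits small or medium files better than very large ones.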
