Get last n lines of a file, similar to tail

挽巷 2020-11-22 03:46

I'm writing a log file viewer for a web application and for that I want to paginate through the lines of the log file. The items in the file are line based with the newest

30 answers
  •  天涯浪人
    2020-11-22 04:06

    For efficiency with very large files (common with log files, where you would typically reach for tail), you generally want to avoid reading the whole file, even if you can do so without holding it all in memory at once. However, you do need to work out the offset in lines rather than characters. One possibility is reading backwards with seek() character by character, but this is very slow. Instead, it's better to process the file in larger blocks.
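
    For comparison, the simpler approach that reads the whole file without holding it all in memory might look like the sketch below. It assumes a hypothetical helper name, tail_simple, and uses collections.deque with maxlen to keep only the last few lines. Every byte of the file is still read, which is the cost the block-based approach avoids.

```python
from collections import deque


def tail_simple(path, nlines):
    """Return the last nlines lines of the file at path (hypothetical helper).

    Every line is read once, but at most nlines of them are kept in memory.
    """
    with open(path, 'rb') as f:
        # A bounded deque discards the oldest line whenever it is full.
        last = deque(f, maxlen=nlines)
    # Strip the line terminators so only the line content is returned.
    return [line.rstrip(b'\r\n') for line in last]
```

    This is fine for small files, but its running time grows with the size of the file rather than with the number of lines requested.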

    I've a utility function I wrote a while ago to read files backwards that can be used here.

    import os
    
    def rblocks(f, blocksize=4096):
        """Read file as series of blocks from end of file to start.
    
        The data itself is in normal order, only the order of the blocks is reversed.
        ie. "hello world" -> ["ld","wor", "lo ", "hel"]
        Note that the file must be opened in binary mode.
        """
        if 'b' not in f.mode.lower():
            raise Exception("File must be opened using binary mode.")
        size = os.stat(f.name).st_size
        fullblocks, lastblock = divmod(size, blocksize)
    
        # The first(end of file) block will be short, since this leaves 
        # the rest aligned on a blocksize boundary.  This may be more 
        # efficient than having the last (first in file) block be short
        f.seek(-lastblock,2)
        yield f.read(lastblock)
    
        for i in range(fullblocks-1,-1, -1):
            f.seek(i * blocksize)
            yield f.read(blocksize)
    
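    def tail(f, nlines):
        if nlines <= 0:
            return []
        buf = b''  # rblocks yields bytes, so accumulate bytes
        result = []
        for block in rblocks(f):
            buf = block + buf
            # Keep line endings so a blank line at a block boundary is not lost
            lines = buf.splitlines(True)
    
            if lines:
                # Blocks arrive newest-first, so prepend to keep file order.
                # The first line may be partial, so carry it into the next block.
                result[:0] = lines[1:]
                if len(result) >= nlines:
                    break
                buf = lines[0]
        else:
            # Start of file reached: anything left in buf is the first line
            if buf:
                result.insert(0, buf)
    
        return [line.rstrip(b'\r\n') for line in result[-nlines:]]
    
    
    with open('file_to_tail.txt', 'rb') as f:
        for line in tail(f, 20):
            # Lines are bytes; decoding as UTF-8 is an assumption about the file
            print(line.decode('utf-8', errors='replace'))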
    

    [Edit] Added more specific version (avoids need to reverse twice)
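
    Since the question is about paginating, here is a rough, self-contained sketch of how the same read-backwards-in-blocks idea could serve pages counted from the end of the file. The function name tail_page and its page and per_page parameters are hypothetical, not part of the code above, and page 0 is taken to mean the newest lines.

```python
import os


def tail_page(path, page, per_page, blocksize=4096):
    """Return page `page` (0 = newest) of `per_page` lines, counted from the end.

    Hypothetical helper: lines are returned as bytes, oldest first within a page.
    """
    needed = (page + 1) * per_page
    with open(path, 'rb') as f:
        pos = f.seek(0, os.SEEK_END)
        buf = b''
        # Read backwards until buf holds more than `needed` newlines, so that
        # the oldest wanted line is complete, or until the start of the file.
        while pos > 0 and buf.count(b'\n') <= needed:
            step = min(blocksize, pos)
            pos -= step
            f.seek(pos)
            buf = f.read(step) + buf
    lines = buf.splitlines()
    if pos > 0:
        lines = lines[1:]  # the first line may be partial, so drop it
    end = len(lines) - page * per_page
    return lines[max(end - per_page, 0):max(end, 0)]
```

    Re-counting newlines and re-joining the buffer on every block keeps the sketch short but is not the fastest option. For deep pages it may be worth remembering the byte offset where the previous page started instead of rescanning from the end.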
