Python - How can I open a file and specify the offset in bytes?

我在风中等你 2020-12-17 14:37

I'm writing a program that will parse an Apache log file periodically to log its visitors, bandwidth usage, etc.

The problem is, I don't want to open the log an

8 answers
  • 2020-12-17 15:13

    Here is code demonstrating both your length suggestion and the tell method:

    beginning = """line1
    line2
    line3"""
    
    end = """- The log will open from this point
    line4
    line5"""
    
    # Write the first part and record the offset where it ends.
    with open('log.txt', 'w') as openfile:
        openfile.write(beginning)
        endstarts = openfile.tell()
    
    with open('log.txt', 'a') as openfile:
        openfile.write(end)
    with open('log.txt') as openfile:
        print(openfile.read())
    
    print("\nAgain:")
    with open('log.txt', 'r') as end2:
        end2.seek(len(beginning))
        # On Windows this starts two characters too early,
        # because each '\n' is stored on disk as '\r\n'.
        print(end2.read())
    
        # The offset reported by tell() is correct on every platform.
        end2.seek(endstarts)
        print("\nOk in Windows also")
        print(end2.read())
    
  • 2020-12-17 15:21

    You can manage the position in the file with the seek and tell methods of the file class; see https://docs.python.org/2/tutorial/inputoutput.html

    The tell method reports the current position, so you can save it and seek to it the next time you open the file.
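
    As a minimal sketch of that idea, the snippet below reads whatever was appended since a saved offset. The helper name read_new_bytes and the file name access.log are made up for illustration, and the file is opened in binary mode on the assumption that plain byte offsets are what you want.

```python
import os
import tempfile

def read_new_bytes(path, offset):
    """Return (data, new_offset) for everything stored after `offset`."""
    with open(path, "rb") as f:  # binary mode: offsets are plain byte counts
        f.seek(offset)           # jump to where the previous run stopped
        data = f.read()          # read only what was appended since then
        return data, f.tell()    # tell() is the offset to save for next time

# Demo on a throwaway file (hypothetical name)
path = os.path.join(tempfile.mkdtemp(), "access.log")
with open(path, "wb") as f:
    f.write(b"first visit\n")

chunk1, offset = read_new_bytes(path, 0)

with open(path, "ab") as f:
    f.write(b"second visit\n")

chunk2, offset = read_new_bytes(path, offset)
print(chunk1)  # b'first visit\n'
print(chunk2)  # b'second visit\n'
```

    Between runs of a real program the offset would have to be stored somewhere persistent, for example in a small side file.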

  • 2020-12-17 15:25

    Here is an efficient and safe snippet that does this by saving the last-read offset in a parallel file. Basically logtail in Python.

    import os

    with open(filename) as log_fd:
        offset_filename = os.path.join(OFFSET_ROOT_DIR, filename)
        if not os.path.exists(offset_filename):
            # First run: create the offset file, starting at byte 0
            os.makedirs(os.path.dirname(offset_filename), exist_ok=True)
            with open(offset_filename, 'w') as offset_fd:
                offset_fd.write(str(0))
        with open(offset_filename, 'r+') as offset_fd:
            # Resume from the saved offset (an empty file counts as 0)
            log_fd.seek(int(offset_fd.readline() or 0))
            new_logrows_handler(log_fd.readlines())
            # Overwrite the old offset with the new position
            offset_fd.seek(0)
            offset_fd.write(str(log_fd.tell()))
            offset_fd.truncate()
    
  • 2020-12-17 15:28

    Easy but not recommended :):

    last_line_processed = get_last_line_processed()
    with open('file.log') as log:
        # Every line is still read; already-processed ones are just skipped
        for record_number, record in enumerate(log):
            if record_number >= last_line_processed:
                parse_log(record)
    
  • 2020-12-17 15:30

    If you're parsing your log line by line, you could just save the line number reached by the last parse. Next time, you would then only have to start reading from that line.

    Seeking is more useful when you need to be at a very specific position in the file.
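
    A rough sketch of the line-number approach, using itertools.islice to skip the lines handled earlier. The function name read_from_line is hypothetical, and io.StringIO stands in for a real log file. Note that islice still has to read the skipped lines, which is the cost that seeking avoids.

```python
import io
from itertools import islice

def read_from_line(lines, start):
    """Return an iterator over the lines whose 0-based index is >= start."""
    return islice(lines, start, None)

# Stand-in for an open log file
log = io.StringIO("a\nb\nc\nd\n")

last_line_processed = 2  # value saved by the previous run
new_lines = list(read_from_line(log, last_line_processed))
next_start = last_line_processed + len(new_lines)  # save this for the next run

print(new_lines)   # ['c\n', 'd\n']
print(next_start)  # 4
```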

  • 2020-12-17 15:31

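    Note that you can seek() in Python relative to the end of the file:

    f.seek(-3, os.SEEK_END)
    

    puts the read position 3 bytes (not lines) before EOF. In Python 3 the file must be opened in binary mode for this; a text-mode file only allows an end-relative seek with an offset of 0.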
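
    A small sketch of an end-relative seek, assuming Python 3 and a file opened in binary mode. The offset passed to seek() counts bytes. The file name sample.log is made up for the demo.

```python
import os
import tempfile

# Throwaway file (hypothetical name) whose last three bytes are b"abc"
path = os.path.join(tempfile.mkdtemp(), "sample.log")
with open(path, "wb") as f:
    f.write(b"line1\nline2\nabc")

with open(path, "rb") as f:      # binary mode is needed for a negative offset
    f.seek(-3, os.SEEK_END)      # position 3 bytes before the end of the file
    tail = f.read()

print(tail)  # b'abc'
```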

    However, why not use diff, either from the shell or with difflib?
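
    One way the difflib idea could look, as a sketch only: compare the lines seen on the previous run with the current lines and keep those that ndiff marks as added. The sample log lines are invented, and this assumes you keep a copy of the old contents, which costs more than storing a single offset.

```python
import difflib

# Lines seen on the previous run and lines in the log now (invented data)
old = ["GET /a\n", "GET /b\n"]
new = ["GET /a\n", "GET /b\n", "GET /c\n", "GET /d\n"]

# ndiff prefixes lines that exist only in `new` with "+ "
added = [line[2:] for line in difflib.ndiff(old, new) if line.startswith("+ ")]

print(added)  # ['GET /c\n', 'GET /d\n']
```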
