Merging and sorting log files in Python

佛祖请我去吃肉 2021-01-02 02:58

I am completely new to Python and I have a serious problem which I cannot solve.

I have a few log files with identical structure:

[timestamp] [level]         


        
5 answers
  •  情话喂你
    2021-01-02 03:30

    First off, you will want to use the fileinput module for getting data from multiple files, like:

    import fileinput

    data = fileinput.FileInput()  # reads the files named in sys.argv[1:]
    for line in data:
        print(line, end="")
    

    This prints the lines of all the files, one file after another. You also want to sort them, which you can do with the built-in sorted() function.

    If your lines started with a timestamp like [2011-07-20 19:20:12], you would be golden: that format sorts chronologically under plain string comparison, so this is enough:

    data = fileinput.FileInput()
    for line in sorted(data):
        print(line, end="")
    

    Since your format is more complex, however, you need to parse the dates yourself and sort on the result:

    def parse_date(line):
        # Parse the timestamp at the start of the line into a datetime
        # object here, e.g. with datetime.strptime().
        raise NotImplementedError

    # sorted() lost its cmp= argument in Python 3; a key function that
    # returns the parsed datetime does the same job.
    data = fileinput.FileInput()
    for line in sorted(data, key=parse_date):
        print(line, end="")
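
    As a rough sketch of the date-parsing step, here is one way to sort with a key function. The timestamp layout "[DD.MM.YYYY HH:MM:SS]", the sample lines, and the names TIMESTAMP_FORMAT and parse_date are all assumptions made for illustration; the real format has to be taken from your own logs:

```python
from datetime import datetime

# ASSUMPTION: each line starts with "[DD.MM.YYYY HH:MM:SS]". Adjust
# TIMESTAMP_FORMAT to whatever your logs actually use.
TIMESTAMP_FORMAT = "%d.%m.%Y %H:%M:%S"

def parse_date(line):
    """Return the leading bracketed timestamp of a log line as a datetime."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, TIMESTAMP_FORMAT)

# Hypothetical sample lines; a plain string sort would order these wrongly
# because the day comes before the month.
lines = [
    "[01.08.2011 09:00:00] [INFO] third\n",
    "[20.07.2011 19:20:12] [WARN] second\n",
    "[05.07.2011 23:59:59] [INFO] first\n",
]

ordered = sorted(lines, key=parse_date)
for line in ordered:
    print(line, end="")
```

    A key function is called once per line, so each timestamp is parsed only once during the sort.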
    

    For bonus points, you can even do

    data = fileinput.FileInput(openhook=fileinput.hook_compressed)
    

    which will enable you to read in gzipped log files.

    The usage would then be:

    $ python yourscript.py access.log.1 access.log.*.gz
    

    or similar.
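
    One caveat for the gzipped case: depending on the Python version, fileinput.hook_compressed may hand back bytes rather than str for compressed files, which then cannot be sorted together with text lines. A possible workaround is a small custom openhook, sketched below. The names open_log and read_sorted are hypothetical, and UTF-8 is an assumed encoding:

```python
import fileinput
import gzip

def open_log(filename, mode):
    """Hypothetical openhook: open .gz files and plain files as text."""
    # ASSUMPTION: the logs are UTF-8 encoded; change this if yours are not.
    if filename.endswith(".gz"):
        return gzip.open(filename, "rt", encoding="utf-8")
    return open(filename, "r", encoding="utf-8")

def read_sorted(paths):
    """Read every line from the given files and return them sorted."""
    with fileinput.input(files=paths, openhook=open_log) as data:
        return sorted(data)
```

    In a script this could be called as read_sorted(sys.argv[1:]), with a key function added to sorted() if the timestamps need parsing.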
