Pythonic script that ignores timestamps in log files

时光总嘲笑我的痴心妄想, posted on 2021-01-28 22:50:51

Question


There are two log files: log A and log B.

log A

2015-07-12 08:50:33,904 [Collection-3]INFO app -Executing Scheduled job: System: choppa1

2015-07-12 09:56:45,060 [Collection-3] INFO app - Executing Scheduled job: System: choppa1

2015-07-12 10:00:00,001 [Analytics_Worker-1] INFO  app  - Trigger for job AnBuildAuthorizationJob was fired.

2015-07-12 11:00:00,007 [Analytics_Worker-1] INFO app - Starting the AnBuildAuthorizationJob job.



log B

2014-07-12 09:50:33,904 [Collection-3] INFO  app  - Executing Scheduled job: System: choppa1

2014-07-12 09:56:45,060 [Collection-3] INFO  app  - Executing Scheduled job: System: choppa1

2014-07-12 10:00:00,001 [Analytics_Worker-1] INFO  app  - Trigger for job AnBuildAuthorizationJob was fired.

2014-07-12 10:00:00,007 [Analytics_Worker-1] INFO  app  - Starting the AnBuildAuthorizationJob job.

The two log files have the same content, but the timestamps differ. I need to compare the two files while ignoring the timestamps, i.e. compare the files line by line and report no difference when two lines differ only in their timestamp. I wrote the following Python script for this:

#!/usr/bin/python
import re
import difflib

program = open("log1.txt", "r")
program_contents = program.readlines()
program.close() 

new_contents = []

pat = re.compile("^[^0-9]")

for line in program_contents:
    if re.search(pat, line):
        new_contents.append(line)

program = open("log2.txt", "r")
program_contents1 = program.readlines()
program.close() 

new_contents1 = []

pat = re.compile("^[^0-9]")

for line in program_contents1:
    if re.search(pat, line):
        new_contents1.append(line)

diff=difflib.ndiff(new_contents,new_contents1)
print(''.join(diff))

Is there a more efficient way of writing the above script? Also, the script works only if the timestamp is at the beginning of the line. I want a Python script that works even if the timestamp is somewhere in the middle of the line. Can anyone help me with this?


Answer 1:


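I would change pat = re.compile("^[^0-9]")

             to pat = re.compile(r"\d{4}-\d{2}-\d{2}")

which, used with re.search, matches the date part of the timestamp wherever it appears in the line. It is also better to open files with

                  with open(filename) as f:

This way Python closes the file for you, so there is no need for an explicit f.close() call.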
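The two suggestions above can be combined into a short sketch. It assumes every timestamp has the form shown in the sample logs (date, time, comma, milliseconds) and removes it with re.sub wherever it occurs in a line. The helper names strip_timestamps, read_without_timestamps and diff_logs are made up for illustration, and the demo uses in-memory lines taken from the question rather than real files.

```python
import difflib
import re

# Assumed timestamp layout, e.g. "2015-07-12 08:50:33,904", anywhere in a line.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def strip_timestamps(lines):
    """Return the lines with every timestamp removed."""
    return [TIMESTAMP.sub("", line) for line in lines]


def read_without_timestamps(path):
    """Read a log file and drop its timestamps (hypothetical helper)."""
    with open(path) as f:  # the file is closed when the block ends
        return strip_timestamps(f)


def diff_logs(lines_a, lines_b):
    """Return unified-diff lines; the list is empty when nothing differs."""
    return list(difflib.unified_diff(lines_a, lines_b, "log A", "log B"))


# Demo with one line from each sample log; only the year differs.
a = strip_timestamps([
    "2015-07-12 10:00:00,001 [Analytics_Worker-1] INFO  app  - "
    "Trigger for job AnBuildAuthorizationJob was fired.\n"
])
b = strip_timestamps([
    "2014-07-12 10:00:00,001 [Analytics_Worker-1] INFO  app  - "
    "Trigger for job AnBuildAuthorizationJob was fired.\n"
])
print("".join(diff_logs(a, b)) or "No differences")
```

For real files, read_without_timestamps("log1.txt") and read_without_timestamps("log2.txt") would supply the two line lists. Note that the removal leaves the surrounding spaces in place, so lines that also differ in whitespace still show up in the diff.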




Answer 2:


Here is a small script that skips the timestamp at the beginning of each line when comparing the two files.

with open("log1.txt", "r") as f:
    program_contents = f.readlines()

with open("log2.txt", "r") as f:
    program_contents1 = f.readlines()

# "YYYY-MM-DD HH:MM:SS,mmm" is 23 characters long; slicing at 19 would
# leave the milliseconds in the comparison.
TIMESTAMP_LEN = 23

# zip() stops at the shorter file, so no index can run out of range.
for line1, line2 in zip(program_contents, program_contents1):
    if line1 == '\n':
        continue
    if line1[TIMESTAMP_LEN:] == line2[TIMESTAMP_LEN:]:
        print("Matches")
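The script above only reports matching lines. As a rough sketch of the opposite view, the version below collects the line pairs that differ once a fixed-width prefix is skipped. It assumes the timestamp fills the first 23 characters of every line ("YYYY-MM-DD HH:MM:SS,mmm"); find_mismatches is a made-up name, and the sample lines are copied from the question.

```python
from itertools import zip_longest

# Assumption: the timestamp occupies the first 23 characters of each line.
TIMESTAMP_LEN = 23


def find_mismatches(lines_a, lines_b, prefix_len=TIMESTAMP_LEN):
    """Return (line_number, a, b) for each pair that differs after the prefix.

    A line that exists in only one input is always reported; its missing
    counterpart appears as None.
    """
    mismatches = []
    for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        if a is None or b is None:
            mismatches.append((number, a, b))
        elif a[prefix_len:].rstrip("\n") != b[prefix_len:].rstrip("\n"):
            mismatches.append((number, a, b))
    return mismatches


log_a = [
    "2015-07-12 10:00:00,001 [Analytics_Worker-1] INFO  app  - "
    "Trigger for job AnBuildAuthorizationJob was fired.\n",
    "2015-07-12 11:00:00,007 [Analytics_Worker-1] INFO app - "
    "Starting the AnBuildAuthorizationJob job.\n",
]
log_b = [
    "2014-07-12 10:00:00,001 [Analytics_Worker-1] INFO  app  - "
    "Trigger for job AnBuildAuthorizationJob was fired.\n",
    "2014-07-12 10:00:00,007 [Analytics_Worker-1] INFO  app  - "
    "Starting the AnBuildAuthorizationJob job.\n",
]

for number, a, b in find_mismatches(log_a, log_b):
    print("line %d differs:\n  A: %r\n  B: %r" % (number, a, b))
```

With these sample lines the second pair is reported because log A uses single spaces around "app" where log B uses two, which is a difference in content rather than in the timestamp.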


Source: https://stackoverflow.com/questions/32863329/pythonic-script-that-ignores-timestamps-in-log-files
