Question
There are two log files: log A and log B.
log A
2015-07-12 08:50:33,904 [Collection-3]INFO app -Executing Scheduled job: System: choppa1
2015-07-12 09:56:45,060 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2015-07-12 10:00:00,001 [Analytics_Worker-1] INFO app - Trigger for job AnBuildAuthorizationJob was fired.
2015-07-12 11:00:00,007 [Analytics_Worker-1] INFO app - Starting the AnBuildAuthorizationJob job.
log B
2014-07-12 09:50:33,904 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2014-07-12 09:56:45,060 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2014-07-12 10:00:00,001 [Analytics_Worker-1] INFO app - Trigger for job AnBuildAuthorizationJob was fired.
2014-07-12 10:00:00,007 [Analytics_Worker-1] INFO app - Starting the AnBuildAuthorizationJob job.
The two log files have the same content, but the timestamps differ. I need to compare the two files while ignoring the timestamps, i.e. compare each line of both files and report no difference when only the timestamp differs. I wrote the following Python script for this:
#!/usr/bin/python
import re
import difflib
program = open("log1.txt", "r")
program_contents = program.readlines()
program.close()
new_contents = []
pat = re.compile("^[^0-9]")
for line in program_contents:
    if re.search(pat, line):
        new_contents.append(line)
program = open("log2.txt", "r")
program_contents1 = program.readlines()
program.close()
new_contents1 = []
pat = re.compile("^[^0-9]")
for line in program_contents1:
    if re.search(pat, line):
        new_contents1.append(line)
diff = difflib.ndiff(new_contents, new_contents1)
print(''.join(diff))
Is there a more efficient way to write this script? Also, the script works only if the timestamp is at the beginning of the line. I want a Python script that works even if the timestamp is somewhere in the middle of the line. Can anyone help me with this?
Answer 1:
I would change
    pat = re.compile("^[^0-9]")
to
    pat = re.compile(r"\d{4}-\d{2}-\d{2}")
so that the pattern matches the date wherever it appears in the line. It is also better to open files with
    with open(filename) as f:
This way Python closes the file for you, and no explicit f.close() call is needed.
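As a rough sketch of how these suggestions might be combined: the timestamps can be removed with a regular-expression substitution before diffing, which works wherever the timestamp sits in the line. This assumes every timestamp has the `YYYY-MM-DD HH:MM:SS,mmm` layout seen in the sample logs; the names `strip_timestamps` and `diff_logs` and the sample lines are made up for illustration.

```python
import difflib
import re

# Assumed timestamp layout, taken from the sample logs: YYYY-MM-DD HH:MM:SS,mmm
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def strip_timestamps(lines):
    """Return the lines with every timestamp removed, wherever it occurs."""
    return [TIMESTAMP.sub("", line) for line in lines]


def diff_logs(lines_a, lines_b):
    """Return unified-diff lines for two logs, ignoring timestamps."""
    return list(difflib.unified_diff(strip_timestamps(lines_a),
                                     strip_timestamps(lines_b),
                                     fromfile="log1.txt", tofile="log2.txt"))


# Illustrative in-memory lines; the second pair has the timestamp mid-line.
log_a = ["2015-07-12 08:50:33,904 [Collection-3] INFO app - job started\n",
         "retry at 2015-07-12 09:56:45,060 for choppa1\n"]
log_b = ["2014-07-12 09:50:33,904 [Collection-3] INFO app - job started\n",
         "retry at 2014-07-12 09:56:45,060 for choppa1\n"]

print("".join(diff_logs(log_a, log_b)) or "no differences")
```

To run it on real files, the lists could be read with `with open("log1.txt") as f: lines = f.readlines()`, so each file is closed automatically.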
Answer 2:
Here is a small script that ignores the timestamp at the beginning of each line.
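TIMESTAMP_LEN = 23  # length of "YYYY-MM-DD HH:MM:SS,mmm"
with open("log1.txt") as program:
    program_contents = program.readlines()
with open("log2.txt") as program:
    program_contents1 = program.readlines()
# zip stops at the shorter file, so unequal lengths do not raise an IndexError
for line, line1 in zip(program_contents, program_contents1):
    if line == '\n':
        continue  # skip blank lines
    # compare everything after the leading timestamp
    if line[TIMESTAMP_LEN:] == line1[TIMESTAMP_LEN:]:
        print("Matches")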
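The fixed-offset slice above only helps when the timestamp starts the line. A possible variation, sketched here under the assumption that timestamps follow the `YYYY-MM-DD HH:MM:SS,mmm` layout of the sample logs, is to keep the line-by-line comparison but remove the timestamp with a regular expression and report which line numbers differ. The function name `mismatched_lines` and the sample lines are hypothetical.

```python
import re
from itertools import zip_longest

# Assumed timestamp layout, taken from the sample logs: YYYY-MM-DD HH:MM:SS,mmm
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def mismatched_lines(lines_a, lines_b):
    """Return 1-based line numbers whose text differs once timestamps are removed."""
    mismatches = []
    for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        # zip_longest pads the shorter log with None; a missing line is a mismatch.
        if a is None or b is None:
            mismatches.append(number)
        elif TIMESTAMP.sub("", a) != TIMESTAMP.sub("", b):
            mismatches.append(number)
    return mismatches


# Illustrative in-memory lines standing in for the two files.
log_a = ["2015-07-12 10:00:00,001 [Analytics_Worker-1] INFO app - Trigger fired.\n",
         "2015-07-12 11:00:00,007 [Analytics_Worker-1] INFO app - Starting the job.\n"]
log_b = ["2014-07-12 10:00:00,001 [Analytics_Worker-1] INFO app - Trigger fired.\n",
         "2014-07-12 10:00:00,007 [Analytics_Worker-1] INFO app - Stopping the job.\n",
         "extra line\n"]

print(mismatched_lines(log_a, log_b))  # [2, 3]
```

Unlike a plain `zip`, `zip_longest` also flags trailing lines that exist in only one of the files.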
Source: https://stackoverflow.com/questions/32863329/pythonic-script-that-ignores-timestamps-in-log-files