I need to diff two log files but ignore the time stamp part of each line (the first 12 characters to be exact). Is there a good tool, or a clever awk command, that could help m
Answers using cut
are fine but sometimes keeping timestamps within the diff
output is appreciable. As the OP's question is about ignoring the time stamps (not removing them), I share here my tricky command line:
diff -I '^#' <(sed -r 's/^((.){12})/#\1\n/' 1.log) <(sed -r 's/^((.){12})/#\1\n/' 2.log)
sed
isolates the timestamps (#
before and \n
after) within a process substitutiondiff -I '^#'
ignores lines having these timestamps (lines beginning by #
)Two log files having same content but different timestamps:
$> for ((i=1;i<11;i++)) do echo "09:0${i::1}:00.000 data $i"; done > 1.log
$> for ((i=1;i<11;i++)) do echo "11:00:0${i::1}.000 data $i"; done > 2.log
Basic diff
command line says all lines are different:
$> diff 1.log 2.log
1,10c1,10
< 09:01:00.000 data 1
< 09:02:00.000 data 2
< 09:03:00.000 data 3
< 09:04:00.000 data 4
< 09:05:00.000 data 5
< 09:06:00.000 data 6
< 09:07:00.000 data 7
< 09:08:00.000 data 8
< 09:09:00.000 data 9
< 09:01:00.000 data 10
---
> 11:00:01.000 data 1
> 11:00:02.000 data 2
> 11:00:03.000 data 3
> 11:00:04.000 data 4
> 11:00:05.000 data 5
> 11:00:06.000 data 6
> 11:00:07.000 data 7
> 11:00:08.000 data 8
> 11:00:09.000 data 9
> 11:00:01.000 data 10
Our tricky diff -I '^#'
does not display any difference (timestamps ignored):
$> diff -I '^#' <(sed -r 's/^((.){12})/#\1\n/' 1.log) <(sed -r 's/^((.){12})/#\1\n/' 2.log)
$>
Change 2.log
(replace data
by foo
on the 6th line) and check again:
$> sed '6s/data/foo/' -i 2.log
$> diff -I '^#' <(sed -r 's/^((.){12})/#\1\n/' 1.log) <(sed -r 's/^((.){12})/#\1\n/' 2.log)
11,13c11,13
11,13c11,13
< #09:06:00.000
< data 6
< #09:07:00.000
---
> #11:00:06.000
> foo 6
> #11:00:07.000
=> timestamps are kept in the diff
output!
You can also use the side by side feature using -y
or --side-by-side
option:
$> diff -y -I '^#' <(sed -r 's/^((.){12})/#\1\n/' 1.log) <(sed -r 's/^((.){12})/#\1\n/' 2.log)
#09:01:00.000 #11:00:01.000
data 1 data 1
#09:02:00.000 #11:00:02.000
data 2 data 2
#09:03:00.000 #11:00:03.000
data 3 data 3
#09:04:00.000 #11:00:04.000
data 4 data 4
#09:05:00.000 #11:00:05.000
data 5 data 5
#09:06:00.000 | #11:00:06.000
data 6 | foo 6
#09:07:00.000 | #11:00:07.000
data 7 data 7
#09:08:00.000 #11:00:08.000
data 8 data 8
#09:09:00.000 #11:00:09.000
data 9 data 9
#09:01:00.000 #11:00:01.000
data 10 data 10
sed
If your sed
implementation does not support the -r
option, you may have to count the twelve dots <(sed 's/^\(............\)/#\1\n/' 1.log)
or use another pattern of your choice ;)