Join two files including unmatched lines in Shell

Submitted by 孤街醉人 on 2020-01-05 02:50:08

Question


File1.log

207.46.13.90  37556
157.55.39.51  34268
40.77.167.109 21824
157.55.39.253 19683

File2.log

207.46.13.90  62343
157.55.39.51  58451
157.55.39.200 37675
40.77.167.109 21824

The expected Output.log is shown below:

207.46.13.90    37556   62343
157.55.39.51    34268   58451
157.55.39.200   -----   37675
40.77.167.109   21824   21824
157.55.39.253   19683   -----

I tried the 'join' command below, but it skips the unmatched lines:

join --nocheck-order File1.log File2.log

It produces the output below (not as expected):

207.46.13.90  37556 62343
157.55.39.51  34268 58451
40.77.167.109 21824 21824

Could someone please help with the proper command for the desired output? Thanks in advance.


Answer 1:


Could you please try the following?

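awk '
# First file on the command line (File2.log): remember its value per IP.
FNR==NR{
  a[$1]=$2
  next
}
# IP present in both files: print the File1.log line plus the File2.log value.
($1 in a){
  print $0,a[$1]
  b[$1]        # mark this IP as matched
  next
}
# IP only in File1.log: pad the missing File2.log value.
{
  print $1,$2,"-----"
}
# IPs only in File2.log (for-in order is unspecified).
END{
  for(i in a){
    if(!(i in b)){
      print i,"-----",a[i]
    }
  }
}
'  File2.log  File1.log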

The output will be as follows:

207.46.13.90  37556 62343
157.55.39.51  34268 58451
40.77.167.109 21824 21824
157.55.39.253 19683 -----
157.55.39.200 ----- 37675
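The output above is complete, but its line order differs from the Output.log in the question, which appears to follow File2.log first and then list the lines found only in File1.log. A minimal sketch of that ordering follows, assuming this reading of the expected output is correct; the array names (val, order, seen) and the temporary directory are only for illustration.

```shell
# Recreate the two inputs from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' '207.46.13.90  37556' '157.55.39.51  34268' \
              '40.77.167.109 21824' '157.55.39.253 19683' > "$workdir/File1.log"
printf '%s\n' '207.46.13.90  62343' '157.55.39.51  58451' \
              '157.55.39.200 37675' '40.77.167.109 21824' > "$workdir/File2.log"

merged=$(awk '
FNR == NR {              # first file (File1.log): keep value and line order
  val[$1] = $2
  order[++n] = $1
  next
}
{                        # second file (File2.log): print in its own order
  left = ($1 in val) ? val[$1] : "-----"
  print $1, left, $2
  seen[$1] = 1
}
END {                    # File1.log keys that File2.log never mentioned
  for (i = 1; i <= n; i++)
    if (!(order[i] in seen))
      print order[i], val[order[i]], "-----"
}
' "$workdir/File1.log" "$workdir/File2.log")

printf '%s\n' "$merged"
```

The columns are separated by single spaces here; piping the result through column -t would align them as in the question.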



Answer 2:


The following is enough if you don't care about the order of the output lines:

join -a1 -a2 -e----- -oauto <(sort file1.log) <(sort file2.log) |
column -t -s' ' -o'   '

With the input files recreated as follows:

cat <<EOF >file1.log
207.46.13.90  37556
157.55.39.51  34268
40.77.167.109 21824
157.55.39.253 19683
EOF
cat <<EOF >file2.log
207.46.13.90  62343
157.55.39.51  58451
157.55.39.200 37675
40.77.167.109 21824
EOF

it outputs:

157.55.39.200   -----   37675
157.55.39.253   19683   -----
157.55.39.51    34268   58451
207.46.13.90    37556   62343
40.77.167.109   21824   21824

join joins on the first column by default. The -a1 -a2 options make it print the unmatched lines from both inputs. The -e----- option fills missing fields with dashes. The -oauto option (a GNU extension) determines the output format from the columns of the inputs. Because the join field is the first column, a plain sort is enough and there is no need to pass a key such as -k1,1 to sort, although sort -s -k1,1 could speed things up. To match the expected output, I also piped the result to column.
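Since -oauto is specific to GNU join, here is a sketch of the same join with an explicit output list, which should also work with other POSIX join implementations. In the list, 0 stands for the join field, and 1.2 and 2.2 for the second field of each input. The temporary directory and the LC_ALL=C setting (so that sort and join agree on collation) are assumptions of this example, not part of the answer above.

```shell
# Recreate the inputs in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '207.46.13.90  37556' '157.55.39.51  34268' \
              '40.77.167.109 21824' '157.55.39.253 19683' > "$tmp/file1.log"
printf '%s\n' '207.46.13.90  62343' '157.55.39.51  58451' \
              '157.55.39.200 37675' '40.77.167.109 21824' > "$tmp/file2.log"

# join needs both inputs sorted on the join field, under the same collation.
LC_ALL=C sort -k1,1 "$tmp/file1.log" > "$tmp/file1.sorted"
LC_ALL=C sort -k1,1 "$tmp/file2.log" > "$tmp/file2.sorted"

# -a 1 -a 2     : also print unpairable lines from both files
# -e '-----'    : text used for fields missing from an unpairable line
# -o 0,1.2,2.2  : join field, then field 2 of file 1, then field 2 of file 2
joined=$(LC_ALL=C join -a 1 -a 2 -e '-----' -o 0,1.2,2.2 \
           "$tmp/file1.sorted" "$tmp/file2.sorted")

printf '%s\n' "$joined"
```

Sorting into temporary files instead of using <(...) keeps the sketch usable in shells without process substitution.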

You can sort the output by ports by piping it to, for example, sort -rnk2,3.
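As a small illustration of such a post-sort, the sketch below orders the joined lines by the second column only, numerically and in descending order. The key -k2,2nr is a variation chosen for this example rather than the exact command above; with a numeric key, the ----- placeholder is compared as zero and therefore ends up last.

```shell
# Joined lines as produced by the join command (single-space separated).
joined_lines='157.55.39.200 ----- 37675
157.55.39.253 19683 -----
157.55.39.51 34268 58451
207.46.13.90 37556 62343
40.77.167.109 21824 21824'

# -k2,2 : use only the second field as the sort key
# n     : compare that key numerically
# r     : reverse, so the largest value comes first
by_second=$(printf '%s\n' "$joined_lines" | LC_ALL=C sort -k2,2nr)

printf '%s\n' "$by_second"
```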



Source: https://stackoverflow.com/questions/59394583/join-two-files-including-unmatched-lines-in-shell
