Compare files with awk

风格不统一 提交于 2019-11-27 19:11:57

To print the common elements in both files:

$ awk 'NR==FNR{a[$1];next}$1 in a{print $1}' file1 file2
"aba"
"abc"
"xxx"

Explanation:

NR and FNR are awk variables that store the total number of records and the number of records in the current files respectively (the default record is a line).

NR==FNR # Only true when in the first file 
{
    a[$1] # Build associative array on the first column of the file
    next  # Skip all proceeding blocks and process next line
}
($1 in a) # Check in the value in column one of the second files is in the array
{
    # If so print it
    print $1
}

If you want to match the whole lines then use $0:

$ awk 'NR==FNR{a[$0];next}$0 in a{print $0}' file1 file2
"aba" 0 0
"xxx" 0 0

Or a specific set of columns:

$ awk 'NR==FNR{a[$1,$2,$3];next}($1,$2,$3) in a{print $1,$2,$3}' file1 file2
"aba" 0 0
"xxx" 0 0

To print the number of matching elements, here's one way using awk:

awk 'FNR==NR { a[$1]; next } $1 in a { c++ } END { print c }' file1.txt file2.txt

Results using your input:

3

If you'd like to add extra columns (for example, columns one, two and three), use a pseudo-multidimensional array:

awk 'FNR==NR { a[$1,$2,$3]; next } ($1,$2,$3) in a { c++ } END { print c }' file1.txt file2.txt

Results using your input:

2
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!