Using awk, how do I print all lines containing duplicates of specific columns?
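Question

Input:

    a;3;c;1
    a;4;b;2
    a;5;c;1

Output:

    a;3;c;1
    a;5;c;1

That is, every line whose combination of columns 1, 3 and 4 occurs more than once should be printed.

Answer 1:

If a 2-pass approach is OK (the first pass over the file counts each key, the second prints the lines whose key was seen more than once):

    $ awk -F';' '{key=$1 FS $3 FS $4} NR==FNR{cnt[key]++;next} cnt[key]>1' file file
    a;3;c;1
    a;5;c;1

otherwise, store every line under its key in a single pass and print the groups with more than one member at the end:

    $ awk -F';' '
        { key=$1 FS $3 FS $4; a[key,++cnt[key]]=$0 }
        END {
            for (key in cnt)
                if (cnt[key] > 1)
                    for (i=1; i<=cnt[key]; i++)
                        print a[key,i]
        }
    ' file
    a;3;c;1
    a;5;c;1

The output order of keys in that second script will be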