Question
So I have a file such as:
10 1 abc
10 2 def
10 3 ghi
20 4 elm
20 5 nop
20 6 qrs
30 3 tuv
I would like to get the maximum value of the second column for each value of the first column, i.e.:
10 3 ghi
20 6 qrs
30 3 tuv
How can I do this using awk or similar Unix commands?
Answer 1:
You can use awk:
awk '$2>max[$1]{max[$1]=$2; row[$1]=$0} END{for (i in row) print row[i]}' file
Output:
10 3 ghi
20 6 qrs
30 3 tuv
Explanation:
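The awk command uses an associative array max with $1 as the key and $2 as the value. Whenever a row's $2 is greater than the value currently stored in max for that key, we update the entry and store the whole row in a second associative array, row, under the same key. Finally, in the END section, we iterate over row and print each saved line. Note that an unset max entry compares as 0, so this assumes the values in column two are positive, and that the order in which for (i in row) visits keys is unspecified.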
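The one-liner above compares each value against an uninitialized array entry, which awk treats as 0, so a key whose values are all zero or negative would never be stored. A minimal sketch of a variant that handles that case and also fixes the output order follows; the function name max_per_key and the final sort step are assumptions added for illustration, not part of the original answer.

```shell
# max_per_key is a hypothetical helper name; it reads the named file(s),
# or standard input when called without arguments.
max_per_key() {
    # "!($1 in max)" makes the first row seen for a key always count,
    # so keys with only zero or negative values are kept.
    # "$2 + 0" forces a numeric comparison. On ties the earlier row wins.
    awk '!($1 in max) || $2 + 0 > max[$1] { max[$1] = $2 + 0; row[$1] = $0 }
         END { for (k in row) print row[k] }' "$@" |
    sort -k1,1n   # for-in order is unspecified, so sort numerically by key
}

# Usage with the sample data from the question:
max_per_key <<'EOF'
10 1 abc
10 2 def
10 3 ghi
20 4 elm
20 5 nop
20 6 qrs
30 3 tuv
EOF
```

The trailing sort assumes numeric keys in the first column; for text keys, plain sort -k1,1 would be the closer fit.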
Answer 2:
A shorter alternative using sort:
$ sort -k1,1 -k2,2nr file | sort -u -k1,1
10 3 ghi
20 6 qrs
30 3 tuv
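The first sort orders the rows by field one and then by field two (numeric, reversed), so the maximum for each key ends up at the top of its group. The second sort, with -u -k1,1, then keeps one row per key. GNU sort keeps the first row of each run of equal keys, which is the maximum; POSIX does not specify which of the equal rows is kept, so other implementations may behave differently.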
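If relying on which row sort -u keeps is a concern, the "pick the first row per key" step can be done with awk instead. This is a sketch of that swap, not the answer's original command; the function name first_per_key_sorted is hypothetical.

```shell
# first_per_key_sorted is a hypothetical helper name; it reads the named
# file(s), or standard input when called without arguments.
first_per_key_sorted() {
    # Sort by key, then by value in reverse numeric order, so the maximum
    # row is the first one in each key group.
    sort -k1,1 -k2,2nr "$@" |
    # seen[$1]++ is 0 (false) only the first time a key appears, so this
    # prints exactly the first row of every group.
    awk '!seen[$1]++'
}

# Usage with the sample data from the question:
first_per_key_sorted <<'EOF'
10 1 abc
10 2 def
10 3 ghi
20 4 elm
20 5 nop
20 6 qrs
30 3 tuv
EOF
```

Unlike the pure awk approach, this sorts the whole input first, which costs more on large files but produces output already ordered by key.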
Source: https://stackoverflow.com/questions/35069048/using-awk-to-get-the-maximum-value-of-a-column-for-each-unique-value-of-another