Using awk to get the maximum value of a column, for each unique value of another column

Question

So I have a file such as:

10 1 abc
10 2 def
10 3 ghi
20 4 elm
20 5 nop
20 6 qrs
30 3 tuv

I would like to get the maximum value of the second column for each value of the first column, i.e.:

10 3 ghi
20 6 qrs
30 3 tuv

How can I do this using awk or similar Unix commands?

Answer 1:

You can use awk:

awk '$2>max[$1]{max[$1]=$2; row[$1]=$0} END{for (i in row) print row[i]}' file

Output:

10 3 ghi
20 6 qrs
30 3 tuv

Explanation:

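The awk command uses an associative array max, keyed by $1, to hold the largest $2 seen so far for each key. Whenever the current line's $2 is greater than the stored maximum for its key, we update max and save the whole line in a second associative array, row, under the same key. Finally, the END block iterates over row and prints each saved line. Two caveats: for (i in row) visits keys in an unspecified order, so the output is not guaranteed to come out sorted, and an unset max[$1] compares as 0, so this form assumes the values in column 2 are positive.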
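If column 2 can contain zero or negative numbers, or the output order matters, a slightly more defensive variant may help. This is only a sketch: the function name max_per_key and the sample data are made up for illustration. It tests whether the key has been seen before comparing, and sorts the result by key.

```shell
# max_per_key [FILE...]: for each key in column 1, print the row whose
# column-2 value is largest. Reads stdin when no file is given.
max_per_key() {
    awk '
        # First sighting of a key, or a strictly larger value: remember the row.
        !($1 in max) || $2 + 0 > max[$1] { max[$1] = $2 + 0; row[$1] = $0 }
        # for-in order is unspecified, so ordering is left to sort below.
        END { for (k in row) print row[k] }
    ' "$@" | sort -k1,1n
}

# Hypothetical sample with non-positive values in column 2.
printf '%s\n' '10 -5 x' '10 -2 y' '20 0 z' | max_per_key
# prints:
# 10 -2 y
# 20 0 z
```

On ties, the first row with the maximum value is kept, as in the one-liner above.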

Answer 2:

A shorter alternative with sort:

$ sort -k1,1 -k2,2nr file | sort -u -k1,1

10 3 ghi
20 6 qrs
30 3 tuv

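Sort by field one, then by field two numerically in reverse, so that the maximum for each key is at the top of its group; the second sort then keeps one line per key. This relies on sort -u keeping the first line of each run of equal keys, which GNU sort does but POSIX does not guarantee.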
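If you would rather not depend on which line sort -u keeps, one option is to let awk pick the first line of each group instead. A minimal sketch, assuming whitespace-separated fields; the function name max_per_key_sorted is made up for illustration:

```shell
# max_per_key_sorted [FILE...]: sort so that the largest column-2 value comes
# first within each column-1 group, then keep only the first row per key.
max_per_key_sorted() {
    # seen[$1]++ is 0 (false) the first time a key appears, so !seen[$1]++
    # is true only for the first line of each group.
    sort -k1,1 -k2,2nr "$@" | awk '!seen[$1]++'
}

# Input order does not matter because the sort runs first.
printf '%s\n' '10 1 abc' '20 6 qrs' '10 3 ghi' '20 4 elm' '30 3 tuv' | max_per_key_sorted
# prints:
# 10 3 ghi
# 20 6 qrs
# 30 3 tuv
```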

Source: https://stackoverflow.com/questions/35069048/using-awk-to-get-the-maximum-value-of-a-column-for-each-unique-value-of-another

Tags

bash

sorting

awk

uniq
