Sort and keep a unique duplicate which has the highest value

Posted by China☆狼群 on 2019-11-26 23:32:50

Question


I have a file like the one shown below. For each combination of the first and second fields, I want to keep only the line with the highest value in the third field (the lines marked with arrows; the arrows are not part of the actual file).

1   1   10
1   1   12        <- 
1   2   6         <-
1   3   4         <- 
2   4   32
2   4   37
2   4   39
2   4   40        <- 
2   45  12
2   45  15        <- 
3   3   12
3   3   15
3   3   17
3   3   19        <- 
3   15  4
3   15  9         <- 
4   17  25
4   17  28
4   17  32
4   17  36        <- 
4   18  4         <- 

so that the output looks like this:

1   1   12
1   2   6
1   3   4
2   4   40
2   45  15
3   3   19
3   15  9
4   17  36
4   18  4

I thought I could just play with the sort and uniq commands, but I made a mess.

Any ideas?

Very important note: the entries in the real file are not neatly sorted to begin with; I sorted this example with sort -k1,1 -k2,2 -k3,3.

Thanks in advance guys


Answer 1:


This is a bit funny, but:

sort -nr myfile.txt | rev | uniq -f1 | rev | sort -n

Output:

1   1   12
1   2   6 
1   3   4 
2   4   40
2   45  15
3   15  9 
3   3   19
4   17  36
4   18  4 

How it works:

  • Sort in reverse numeric order, putting the highest values at the top (so they are the ones kept)
  • Reverse each line, so the last field comes first (needed for uniq)
  • Keep only the first line of each group of duplicates, ignoring the first field (which was originally the last field)
  • Reverse each line back to its original order
  • Sort the lines from low to high again

Probably not the most efficient in the world, but at least each step makes some sense.
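
To make the steps above concrete, here is a small sketch that runs the same pipeline on a made-up, deliberately unsorted sample. The sample data, the temporary file, the variable names and the LC_ALL=C setting are all assumptions for the demo (the setting is only there to make the ordering predictable), not part of the original answer.

```shell
#!/usr/bin/env bash
# Assumption: force the C locale so ties are broken by plain byte comparison.
export LC_ALL=C

# Hypothetical sample input, deliberately unsorted (not the asker's real file).
sample=$(mktemp)
cat > "$sample" <<'EOF'
2   4   32
1   1   12
2   4   40
1   1   10
1   2   6
EOF

# Step 1 on its own: highest values come first within each group.
sort -nr "$sample"

# Full pipeline: reverse each line, drop later duplicates while ignoring
# the (now first) value field, reverse back, then sort ascending again.
result=$(sort -nr "$sample" | rev | uniq -f1 | rev | sort -n)
printf '%s\n' "$result"

rm -f "$sample"
```

One thing to watch, as far as I can tell: sort -nr without a key only compares the leading number numerically, and ties fall back to a text comparison of the whole line. So if the third-field values in one group have different digit counts (say 9 and 10), the wrong line may end up on top. The sample here avoids that case.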




Answer 2:


Two passes of sort should do it, for example in the bash shell:

sort -k1,1n -k2,2n -k3,3nr -t$'\t' file | sort -k1,1n -k2,2n -t$'\t' -u -s

Output:

1       1       12
1       2       6
1       3       4
2       4       40
2       45      15
3       3       19
3       15      9
4       17      36
4       18      4
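
For anyone who wants to try the two-pass approach without the original file, here is a minimal sketch on a made-up, tab-separated, unsorted sample. The sample values, the temporary file, the variable names and the LC_ALL=C setting are assumptions for the demo; the sort options are the ones from the answer.

```shell
#!/usr/bin/env bash
# Assumption: force the C locale for predictable behaviour.
export LC_ALL=C

# Hypothetical tab-separated sample, deliberately unsorted.
# printf reuses the format string for every group of three arguments.
sample=$(mktemp)
printf '%s\t%s\t%s\n' \
  2 4 32 \
  1 1 12 \
  2 45 15 \
  2 4 40 \
  1 1 10 \
  2 45 12 > "$sample"

# Pass 1: order by fields 1 and 2 ascending, field 3 descending,
#         so the highest value is first in each group.
# Pass 2: stable sort on fields 1 and 2 only; -u keeps the first
#         line of each group, which is the one with the highest value.
best=$(sort -t$'\t' -k1,1n -k2,2n -k3,3nr "$sample" \
  | sort -t$'\t' -k1,1n -k2,2n -u -s)
printf '%s\n' "$best"

rm -f "$sample"
```

Because every key is compared numerically here, third-field values with different digit counts should be handled correctly. Note that -t$'\t' assumes the fields really are separated by tabs; for space-aligned columns the separator option would need to change.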


Source: https://stackoverflow.com/questions/22822465/sort-and-keep-a-unique-duplicate-which-has-the-highest-value
