Find max per group and return another column

Submitted by ﹥>﹥吖頭↗ on 2019-11-28 03:40:22

Question


Given the following test matrix:

testMatrix <- matrix( c(1,1,2,10,20,30,300,100,200,"A","B","C"), 3, 4)

colnames(testMatrix) <- c("GroupID", "ElementID", "Value", "Name")

I want to find the maximum Value within each group and return the Name from that row. For example, I would expect 1, A and 2, C. If several rows tie for the maximum, the first match is fine. After that I need to attach the result to the matrix as a new column "GroupName".

How can I do this?

I already have the Group, Max Value combination:

groupMax <- aggregate (as.numeric(testMatrix[,3]), by=list( testMatrix[,1] ), max )

I usually add columns to my matrix like this (let's assume a matrix groupNames with GroupID, Name combinations already exists):

testMatrix <- cbind ( testMatrix, groupNames[match( testMatrix[,1], groupNames[,1] ), 2] ) 

Answer 1:


Base solution, not as simple as Dan M's:

testMatrix <- data.frame(GroupID = c(1,1,2), ElementID = c(10,20,30), 
    Value=c(300,100,200), Name=c("A","B","C"))

A <- lapply(split(testMatrix, testMatrix$GroupID), function(x) {
        x[which.max(x$Value), c(1, 4)]
    }
)
do.call(rbind, A)
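The question also asks to attach the result as a new column "GroupName". A minimal sketch of that step, assuming the data frame from this answer and reusing the `match` lookup from the question; `groupNames` here is the per-group result computed above, not an object the answer itself defines:

```r
# Same example data as above, rebuilt so the block runs on its own.
testMatrix <- data.frame(GroupID = c(1, 1, 2), ElementID = c(10, 20, 30),
                         Value = c(300, 100, 200), Name = c("A", "B", "C"))

# One row per group: the row holding the (first) maximum Value.
A <- lapply(split(testMatrix, testMatrix$GroupID), function(x) {
  x[which.max(x$Value), c("GroupID", "Name")]
})
groupNames <- do.call(rbind, A)

# Look up each row's group in groupNames and copy the winning Name across.
testMatrix$GroupName <- groupNames$Name[match(testMatrix$GroupID, groupNames$GroupID)]
testMatrix
```

Because `which.max` returns the index of the first maximum, ties resolve to the first matching row, which is what the question asks for.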



Answer 2:


A data.table solution, for time and memory efficiency and syntactic elegance:

library(data.table)
DT <- as.data.table(testMatrix)
DT[,list(Name = Name[which.max(Value)]),by = GroupID] 
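To attach the name as a new column instead of returning a summary table, data.table's `:=` operator can be combined with `by`. This is a sketch under the assumption that the data is built directly as a data.table with a numeric Value column; the column name GroupName comes from the question:

```r
library(data.table)

DT <- data.table(GroupID = c(1, 1, 2), ElementID = c(10, 20, 30),
                 Value = c(300, 100, 200), Name = c("A", "B", "C"))

# := adds the column by reference; the single Name found per group
# is recycled to every row of that group.
DT[, GroupName := Name[which.max(Value)], by = GroupID]
DT
```

No separate lookup or merge step is needed, since the grouped assignment writes straight into the original table.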



Answer 3:


As @Tyler said, a data.frame is easier to work with. Here's an option:

library(plyr)
testMatrix <- data.frame(GroupID = c(1,1,2), ElementID = c(10,20,30), Value=c(300,100,200), Name=c("A","B","C"))
ddply(testMatrix, .(GroupID), summarize, Name=Name[which.max(Value)])



Answer 4:


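I figured out a nice way to do this via dplyr (testMatrix must be a data frame, as in the answers above):

library(dplyr)
filter(group_by(testMatrix, GroupID), min_rank(desc(Value)) == 1)  # keeps every row tied for the maximum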
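One caveat with `min_rank(desc(Value)) == 1`: when several rows in a group share the maximum, all of them are kept. The sketch below contrasts that with `slice(which.max(Value))`, which keeps only the first maximum per group. The data frame `df` and its tie in group 1 are made up for illustration and are not part of the original question:

```r
library(dplyr)

# Hypothetical data: rows A and B tie for the maximum in group 1.
df <- data.frame(GroupID = c(1, 1, 2, 2),
                 Value   = c(300, 300, 200, 100),
                 Name    = c("A", "B", "C", "D"))

# min_rank gives every tied row rank 1, so both A and B survive the filter.
all_ties <- filter(group_by(df, GroupID), min_rank(desc(Value)) == 1)

# which.max returns the first maximum only, so each group yields one row.
first_only <- slice(group_by(df, GroupID), which.max(Value))

all_ties
first_only
```

Which behaviour is preferable depends on whether tied rows should all be reported or collapsed to a single representative.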


Source: https://stackoverflow.com/questions/12039681/find-max-per-group-and-return-another-column
