Question
Given the following test matrix:
testMatrix <- matrix( c(1,1,2,10,20,30,300,100,200,"A","B","C"), 3, 4)
colnames(testMatrix) <- c("GroupID", "ElementID", "Value", "Name")
I want to find the maximum Value within each group and return the Name from the row holding that maximum. For this data I would expect 1, A and 2, C. If several rows tie for the maximum, the first match is fine. I then need to attach the result to the matrix as a new column, "GroupName".
How can I do this?
I already have the Group, Max Value combination:
groupMax <- aggregate (as.numeric(testMatrix[,3]), by=list( testMatrix[,1] ), max )
The way I used to add columns to my matrix works like this (let's assume there is also already a matrix groupNames with GroupID, Name combinations):
testMatrix <- cbind ( testMatrix, groupNames[match( testMatrix[,1], groupNames[,1] ), 2] )
Answer 1:
Base solution, not as simple as Dan M's:
testMatrix <- data.frame(GroupID = c(1,1,2), ElementID = c(10,20,30),
                         Value = c(300,100,200), Name = c("A","B","C"))
# Split by group, keep GroupID and Name from each group's max-Value row
A <- lapply(split(testMatrix, testMatrix$GroupID), function(x) {
  x[which.max(x$Value), c(1, 4)]
})
do.call(rbind, A)
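The question also asks for the result to be attached as a "GroupName" column, which the code above stops short of. A minimal sketch of that last step, assuming the data.frame version of testMatrix from this answer and reusing the match() lookup from the question (the name groupNames is borrowed from the question; as.character() is only there to guard against Name being a factor on older R versions):

```r
testMatrix <- data.frame(GroupID = c(1, 1, 2), ElementID = c(10, 20, 30),
                         Value = c(300, 100, 200), Name = c("A", "B", "C"))

# One row per group: GroupID and Name of the row with the largest Value.
# which.max() returns the first index when there is a tie.
groupNames <- do.call(rbind, lapply(split(testMatrix, testMatrix$GroupID), function(x) {
  x[which.max(x$Value), c("GroupID", "Name")]
}))

# Look up each row's GroupID in groupNames and attach the winning Name
idx <- match(testMatrix$GroupID, groupNames$GroupID)
testMatrix$GroupName <- as.character(groupNames$Name)[idx]
```

Every row of a group receives the same GroupName, so both rows of group 1 carry the Name of the row whose Value is 300.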
Answer 2:
A data.table solution, for time and memory efficiency and syntactic elegance:
library(data.table)
DT <- as.data.table(testMatrix)
DT[,list(Name = Name[which.max(Value)]),by = GroupID]
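To attach the per-group name to every row instead of summarising, one option is data.table's `:=` operator combined with `by`. This is a sketch only, assuming testMatrix is the data.frame version used in the other answers, so that Value is numeric:

```r
library(data.table)

testMatrix <- data.frame(GroupID = c(1, 1, 2), ElementID = c(10, 20, 30),
                         Value = c(300, 100, 200), Name = c("A", "B", "C"))
DT <- as.data.table(testMatrix)

# := adds the column by reference; the single value computed per group
# is recycled to every row of that group
DT[, GroupName := Name[which.max(Value)], by = GroupID]
```

DT keeps all of its original rows and gains a GroupName column, so no separate lookup or merge step is needed.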
Answer 3:
As @Tyler said, a data.frame is easier to work with. Here's an option:
library(plyr)
testMatrix <- data.frame(GroupID = c(1,1,2), ElementID = c(10,20,30), Value=c(300,100,200), Name=c("A","B","C"))
ddply(testMatrix, .(GroupID), summarize, Name=Name[which.max(Value)])
Answer 4:
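I figured out a nice way to do this via dplyr. Note that min_rank gives tied values the same rank, so this keeps every row that ties for the maximum rather than only the first:
library(dplyr)
filter(group_by(testMatrix, GroupID), min_rank(desc(Value)) == 1)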
Source: https://stackoverflow.com/questions/12039681/find-max-per-group-and-return-another-column