R Subset data.frame from max value of one vector and grouped by another [duplicate]

Question

ID <- c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C')
WK <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5)
NumSuccess <- c(0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 3)
Data <- data.frame(ID, WK, NumSuccess)

I am trying to create a subset data.frame "Data2" that keeps, for each "ID", the row with the maximum value of "WK", together with its "NumSuccess" value. The resulting data.frame should look like this:

ID <- c('A', 'B', 'C')
WK <- c(3, 3, 5)
NumSuccess <- c(2, 1, 3)
Data2 <- data.frame(ID, WK, NumSuccess)

Answer 1:

This can be done in more than one way. If several rows can tie for the maximum 'WK' within an 'ID' and you want all of those rows, group by 'ID' and filter with the logical condition WK == max(WK).

library(dplyr)
Data %>% 
      group_by(ID) %>% 
      filter(WK==max(WK))
#   ID WK NumSuccess
#1  A  3          2
#2  B  3          1
#3  C  5          3

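If there is a single maximum 'WK' per 'ID' (or only one row per 'ID' is wanted), we can use which.max, which returns the index of the first maximum, or use arrange to order the dataset by descending 'WK' and then take the first row of each 'ID'.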

Data %>% 
      group_by(ID) %>%
      slice(which.max(WK))
#   ID WK NumSuccess
#1  A  3          2
#2  B  3          1
#3  C  5          3

Data %>%
      group_by(ID) %>%
      arrange(-WK) %>% 
      slice(1)

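Similar approaches with data.table:

library(data.table)
# all rows whose WK equals the group maximum (keeps ties)
setDT(Data)[, .SD[max(WK)==WK], ID]
# the first row with the maximum WK in each group
setDT(Data)[, .SD[which.max(WK)], ID]
# sort by WK (by reference), then take the last row of each group
setkey(setDT(Data), WK)[, .SD[.N], ID]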

Or we can use ave from base R:

Data[with(Data, ave(WK, ID, FUN=max)==WK),]
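
The difference between keeping all tied rows and keeping a single row per 'ID' only shows up when the data contain ties, which the example data do not. The following is a minimal base R sketch of that difference; the data frame Tied and the names all_max, ordered and one_max are hypothetical and made up for illustration, and the order/duplicated step is just one possible base R stand-in for the which.max approach, not something taken from the original answer.

```r
# Hypothetical data with a tie: ID 'A' reaches its maximum WK (3) twice.
Tied <- data.frame(
  ID = c("A", "A", "A", "B", "B"),
  WK = c(1, 3, 3, 1, 2),
  NumSuccess = c(0, 2, 5, 0, 1)
)

# Keep every row whose WK equals the maximum of its ID group (ties are all kept).
all_max <- Tied[with(Tied, ave(WK, ID, FUN = max) == WK), ]

# Keep exactly one row per ID: sort by ID and descending WK,
# then drop everything except the first row of each ID.
ordered <- Tied[order(Tied$ID, -Tied$WK), ]
one_max <- ordered[!duplicated(ordered$ID), ]

print(all_max)
print(one_max)
```

Here all_max should contain both tied rows for 'A' plus the single row for 'B', while one_max keeps only the first of the tied rows. Which behaviour is appropriate depends on whether tied weeks are meaningful in the data.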

Source: https://stackoverflow.com/questions/29586382/r-subset-data-frame-from-max-value-of-one-vector-and-grouped-by-another

Tags

subset