R Subset data.frame from max value of one vector and grouped by another [duplicate]

这一生的挚爱 提交于 2019-12-21 14:59:06

问题


>ID<-c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C')
>WK<-c(1, 2, 3, 1, 2, 3, 1, 2, 3, 4, 5)
>NumSuccess<-c(0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 3)
>Data<-data.frame(ID, WK, NumSuccess)

I am trying to create a subset data.frame "Data2" based on the value in "NumSuccesses" that corresponds to the Max Value in "WK" grouped by "ID". Resulting data.frame should look like this:

>ID<-c('A','B','C')
>WK<-c(3, 3, 5)
>NumSuccess<-c(2, 1, 3)
>Data2<-data.frame(ID, WK, NumSuccess)

回答1:


This could be done in more than one way. If there are ties for 'WK', maximum value per each 'ID' and want all the rows with the maximum 'WK', it may be useful to filter with the logical condition (WK==max(WK)) after grouping by the 'ID'.

library(dplyr)
Data %>% 
      group_by(ID) %>% 
      filter(WK==max(WK))
#   ID WK NumSuccess
#1  A  3          2
#2  B  3          1
#3  C  5          3

If there is a single 'max' value for 'WK' per 'ID', we can use which.max or use arrange to order the dataset by 'WK' for each 'ID'

  Data %>% 
       group_by(ID) %>%
       slice(which.max(WK))
  #     ID WK NumSuccess
  #1  A  3          2
  #2  B  3          1
  #3  C  5          3

 Data %>%
       group_by(ID) %>%
       arrange(-WK) %>% 
       slice(1)

Similar approaches in data.table are

library(data.table)
setDT(Data)[, .SD[max(WK)==WK], ID]
setDT(Data)[, .SD[which.max(WK)], ID]
setkey(setDT(Data), WK)[, .SD[.N], ID]

Or we can use ave from base R

 Data[with(Data, ave(WK, ID, FUN=max)==WK),]


来源:https://stackoverflow.com/questions/29586382/r-subset-data-frame-from-max-value-of-one-vector-and-grouped-by-another

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!