R - subset column based on condition on duplicate rows

旧巷老猫 submitted on 2019-12-02 03:59:54

The expected output is not entirely clear, but maybe this helps:

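 # TRUE for every row whose ID has more than one zero Site_count
 indx <- with(DF, ave(!Site_count, ID, FUN=function(x) sum(x)>1))
 # drop the repeated (non-first) rows of those IDs
 DF[!(duplicated(DF$ID) & indx),]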
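To make the effect concrete, here is a small sketch on made-up data (the toy `DF` below is an assumption, not the data from the question). `!Site_count` is TRUE where the count is zero, so `indx` flags every row whose ID has more than one zero count, and the subset then drops the repeated rows of those IDs only.

```r
# Hypothetical toy data for illustration only
DF <- data.frame(
  ID         = c(1, 1, 1, 2, 2, 3),
  Site_count = c(0, 0, 2, 0, 5, 0)
)

# TRUE for every row whose ID group holds more than one zero Site_count
indx <- with(DF, ave(!Site_count, ID, FUN = function(x) sum(x) > 1))

# Drop the non-first rows of the flagged groups; other groups stay whole
res <- DF[!(duplicated(DF$ID) & indx), ]
print(res)
```

Here only ID 1 has two zero counts, so its second and third rows are dropped (including the row with a count of 2), while both rows of ID 2 are kept.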

Update

After re-reading the description, your expected answer could also be:

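 # non-zero for every row whose ID has at least one positive Site_count
 indx <- with(DF, ave(Site_count, ID, FUN=function(x) any(x>0)))
 # keep only the first row of those IDs; all-zero IDs keep every row
 DF[!(duplicated(DF$ID) & indx),]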
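A minimal sketch of how this second variant behaves, again on made-up data (the toy `DF` is an assumption, not taken from the question): IDs with at least one positive count are collapsed to their first row, while IDs whose counts are all zero keep every row.

```r
# Hypothetical toy data for illustration only
DF <- data.frame(
  ID         = c(1, 1, 2, 2, 3),
  Site_count = c(3, 0, 0, 0, 0)
)

# ave() keeps the numeric type of Site_count, so indx is 1/0 rather than
# TRUE/FALSE; `&` below coerces it back to logical
indx <- with(DF, ave(Site_count, ID, FUN = function(x) any(x > 0)))

# Remove non-first rows of IDs that have any positive count
res <- DF[!(duplicated(DF$ID) & indx), ]
print(res)
```

Only ID 1 has a positive count, so its second row is removed; the two all-zero rows of ID 2 both survive.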

Possibly this:

set.seed(42)
DF <- data.frame(
  ID = c(sample(1:3, 10, replace = TRUE), 4),
  Site_count = c(sample(0:3, 10, replace = TRUE), 0)
)

#   ID Site_count
#1   3          1
#2   3          2
#3   1          3
#4   3          1
#5   2          1
#6   2          3
#7   3          3
#8   1          0
#9   2          1
#10  3          2
#11  4          0


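# Return the first positive value in x; if no value is positive,
# which.max() picks index 1, so the first element is returned.
fun <- function(x) {
  if (length(x) == 1L) return(x)
  x[which.max(x > 0)]
}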
library(plyr)
ddply(DF, .(ID), summarise, Site_count = fun(Site_count))
#  ID Site_count
#1  1          3
#2  2          1
#3  3          1
#4  4          0
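
If loading plyr is not desirable, the same per-ID summary can probably be obtained with base R's `aggregate()`. The sketch below hardcodes the data frame printed above instead of calling `sample()`, because `sample()` with a given seed can return different values across R versions. It redefines `fun` so that the block runs on its own.

```r
# Data copied from the printed DF above (hardcoded rather than sampled)
DF <- data.frame(
  ID         = c(3, 3, 1, 3, 2, 2, 3, 1, 2, 3, 4),
  Site_count = c(1, 2, 3, 1, 1, 3, 3, 0, 1, 2, 0)
)

# First positive value per group; falls back to the first element
# when no value is positive
fun <- function(x) {
  if (length(x) == 1L) return(x)
  x[which.max(x > 0)]
}

# One row per ID, with Site_count summarised by fun
res <- aggregate(Site_count ~ ID, data = DF, FUN = fun)
print(res)
```

On this data the result matches the `ddply` output shown above (Site_count 3, 1, 1, 0 for IDs 1 to 4).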