subset based on frequency level [duplicate]

柔情痞子 submitted on 2019-12-01 00:21:18
df1[df1$ID %in% names(table(df1$ID))[table(df1$ID) > 9], ]

This tests whether each df1$ID value belongs to an ID that occurs more than 9 times. If it does, the corresponding element of the resulting logical vector is TRUE. Passing that vector as the "i" argument makes the [ function return the entire row, since the "j" argument is empty.

See:

?`[`
?'%in%'
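
To see the pieces of that expression at work, here is a small sketch on made-up data. The contents of df1 and the threshold of 2 (instead of 9) are assumptions chosen only to keep the example short; they are not the asker's data.

```r
# Hypothetical toy data: ID "a" occurs 3 times, "b" once, "c" twice
df1 <- data.frame(ID = c("a", "a", "b", "a", "c", "c"),
                  value = 1:6,
                  stringsAsFactors = FALSE)

counts <- table(df1$ID)                  # frequency of each ID
keep_ids <- names(counts)[counts > 2]    # IDs occurring more than 2 times
keep_rows <- df1$ID %in% keep_ids        # logical vector, one element per row

result <- df1[keep_rows, ]               # empty "j" keeps all columns
print(result)
```

Only the three rows with ID "a" survive, because "a" is the only ID whose count exceeds the threshold.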

Maybe closer to what you had in mind is to create a per-row vector of group frequencies using ave, where cutoff is your threshold (for example, 9):

# seq_along(ID) keeps the counts numeric even if ID is character or a factor
subset(df1, ave(seq_along(ID), ID, FUN = length) > cutoff)
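
As a rough illustration of what ave contributes here, the sketch below prints the per-row frequency vector before filtering on it. The toy df1 with numeric IDs and the cutoff of 2 are assumptions made up for this example.

```r
# Hypothetical toy data and threshold, for illustration only
df1 <- data.frame(ID = c(1, 1, 2, 1, 3, 3), value = 1:6)
cutoff <- 2

# ave() returns one value per row: the size of that row's ID group
freq <- ave(df1$ID, df1$ID, FUN = length)
print(freq)   # per-row group sizes: 3 3 1 3 2 2

# Keep only the rows whose ID group is larger than the cutoff
result <- subset(df1, freq > cutoff)
print(result)
```

Because freq has the same length as the data frame, it can be used directly as a row filter without first building a lookup of qualifying IDs.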

Using dplyr, where n() gives the number of rows in the current group:

library(dplyr)
df1 %>%
  group_by(ID) %>%
  filter(n() > cutoff)