Remove all unique rows

Submitted by 时光毁灭记忆、已成空白 on 2019-12-17 20:36:37

Question


I am trying to figure out how to remove all unique rows from a data frame while keeping every row that has a duplicate. For example, from the data frame below I want every row whose col1 value appears more than once:

df<-data.frame(col1=c(rep("a",3),"b","c",rep("d",3)),col2=c("A","B","C",rep("A",3),"B","C"),col3=c(3,3,1,4,4,3,2,1))
df
  col1 col2 col3
1    a    A    3
2    a    B    3
3    a    C    1
4    b    A    4
5    c    A    4
6    d    A    3
7    d    B    2
8    d    C    1

subset(df,duplicated(col1))
  col1 col2 col3
2    a    B    3
3    a    C    1
7    d    B    2
8    d    C    1

But I want rows 1, 2, 3, 6, 7 and 8, since each of them shares its col1 value with at least one other row. How do I get rows 1 and 6 included? Or, conversely, how do I remove the rows that have no duplicate?


Answer 1:


Another option:

subset(df,duplicated(col1) | duplicated(col1, fromLast=TRUE))
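To see why this works, here is a small sketch that splits the condition into its two halves. The names fwd, bwd, keep and result are only illustrative and do not come from the answer itself; df is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the data frame from the question
df <- data.frame(
  col1 = c(rep("a", 3), "b", "c", rep("d", 3)),
  col2 = c("A", "B", "C", rep("A", 3), "B", "C"),
  col3 = c(3, 3, 1, 4, 4, 3, 2, 1)
)

# duplicated() flags every occurrence of a value except the first one
fwd <- duplicated(df$col1)

# With fromLast = TRUE the scan runs from the end,
# so every occurrence except the last one is flagged
bwd <- duplicated(df$col1, fromLast = TRUE)

# A row is kept if either scan flags it, i.e. its col1 value occurs at least twice
keep <- fwd | bwd
result <- df[keep, ]
print(result)
```

Rows 4 and 5 (b and c) are flagged by neither scan, so they are the only ones dropped.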



Answer 2:


Try:

> tdf <- table(df$col1)
> tdf
a b c d 
3 1 1 3 

> df[df$col1 %in% names(tdf)[tdf>1],]
  col1 col2 col3
1    a    A    3
2    a    B    3
3    a    C    1
6    d    A    3
7    d    B    2
8    d    C    1



Answer 3:


You can do this by creating an index with ave:

df[as.logical(ave(1:nrow(df), df$col1, FUN=function(x) length(x) > 1)), ]

produces

  col1 col2 col3
1    a    A    3
2    a    B    3
3    a    C    1
6    d    A    3
7    d    B    2
8    d    C    1
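A possible variant of the same idea, offered only as a sketch: let ave return the group size for each row and compare that to 1 afterwards, which avoids the as.logical conversion. The names n_per_group and result are made up for this illustration.

```r
# Rebuild the data frame from the question
df <- data.frame(
  col1 = c(rep("a", 3), "b", "c", rep("d", 3)),
  col2 = c("A", "B", "C", rep("A", 3), "B", "C"),
  col3 = c(3, 3, 1, 4, 4, 3, 2, 1)
)

# For every row, compute how many rows share its col1 value.
# ave() applies FUN within each col1 group and recycles the result over the group.
n_per_group <- ave(seq_along(df$col1), df$col1, FUN = length)

# Keep only rows whose group has more than one member
result <- df[n_per_group > 1, ]
print(result)
```

This should select the same six rows as the version above; which form reads better is a matter of taste.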


Source: https://stackoverflow.com/questions/21946201/remove-all-unique-rows
