Finding elements that do not overlap between two vectors

試著忘記壹切 posted on 2019-11-30 08:52:54

Yes, there is a way:

setdiff(list.a, list.b)
# [1] "Mary"     "Jack"     "Michelle"

I think it should be mentioned that the accepted answer is only partially correct. The command setdiff(list.a, list.b) finds the non-overlapping elements only if they are contained in the object passed as the first argument.

If you are not aware of this behaviour and run setdiff(list.b, list.a) instead, the result in this case is character(0), which would lead you to conclude that there are no non-overlapping elements.

Using a slightly extended example for illustration, an obvious quick fix is:

list.a <- c("James", "Mary", "Jack", "Sonia", "Michelle", "Vincent")
list.b <- c("James", "Sonia", "Vincent", "Iris")

c(setdiff(list.b, list.a), setdiff(list.a, list.b))
# [1] "Iris"     "Mary"     "Jack"     "Michelle" 
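The same symmetric difference can also be written as "everything in either vector, minus what they share". The following is a minimal sketch using only base R set functions; the helper name sym_diff is hypothetical and not part of base R, and the order of the result differs from the concatenated setdiff() calls above.

```r
# sym_diff is a hypothetical helper name, not a base R function.
# It returns the elements found in exactly one of the two vectors.
sym_diff <- function(a, b) {
  setdiff(union(a, b), intersect(a, b))
}

list.a <- c("James", "Mary", "Jack", "Sonia", "Michelle", "Vincent")
list.b <- c("James", "Sonia", "Vincent", "Iris")

sym_diff(list.a, list.b)
# [1] "Mary"     "Jack"     "Michelle" "Iris"
```

Because union() and intersect() do not depend on argument order for their contents, swapping the arguments returns the same elements, only in a different order.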

An extended answer based on the comments from Hadley and myself: here's how to allow for duplicates.

Final Edit: I do not recommend anyone use this, because the result may not be what you expect. If there is a repeated value in x which is not in y, you will see that value repeated in the output. But: if, say, there are four 9s in x and one 9 in y, all the 9s will be removed. One might expect to retain three of them; that takes messier code.

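mysetdiff <- function(x, y, multiple = FALSE) {
    x <- as.vector(x)
    y <- as.vector(y)
    if (length(x) || length(y)) {
        # Elements of x that have no match in y
        kept <- x[match(x, y, 0L) == 0L]
        # multiple = TRUE keeps repeated values; otherwise deduplicate like setdiff()
        if (multiple) kept else unique(kept)
    } else {
        x
    }
}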

Rgames> x
[1]  8  9  6 10  9
Rgames> y
[1] 5 3 8 8 1
Rgames> setdiff(x,y)
[1]  9  6 10
Rgames> mysetdiff(x,y)
[1]  9  6 10
Rgames> mysetdiff(x,y,mult=T)
[1]  9  6 10  9
Rgames> mysetdiff(y,x,mult=T)
[1] 5 3 1
Rgames> setdiff(y,x)
[1] 5 3 1
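For the case mentioned in the final edit, where four 9s in x and one 9 in y should leave three 9s, one possible approach is to number each occurrence of a value and match on the pair (value, occurrence number). This is only a sketch: the name multiset_diff is hypothetical, and because the keys are built with paste(), numeric values that differ only beyond about 15 significant digits would be treated as equal.

```r
# multiset_diff is a hypothetical helper name, not a base R function.
# Each copy of a value in y cancels at most one copy of that value in x.
multiset_diff <- function(x, y) {
  # Number repeated values: the first 9 gets 1, the second 9 gets 2, and so on.
  occurrence <- function(v) ave(seq_along(v), v, FUN = seq_along)
  key_x <- paste(x, occurrence(x))
  key_y <- paste(y, occurrence(y))
  # Keep the elements of x whose (value, occurrence) pair is absent from y.
  x[!key_x %in% key_y]
}

multiset_diff(c(9, 9, 9, 9, 6), c(9, 5))
# [1] 9 9 9 6
```

With the x and y from the session above, multiset_diff(x, y) gives the same result as mysetdiff(x, y, mult=T), because none of the repeated values in x occur in y.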

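A nice one-liner with dplyr that keeps duplicates (the column needs the same name in both tables so that the join has something to match on):

library(dplyr)
anti_join(tibble(x = c(1, 1, 2, 2)), tibble(x = c(1, 1)), by = "x")

This returns a data frame with the two rows 2, 2. However, it does not handle removing 1, 2 from 1, 1, 2, 2: anti_join drops every row of the first table that has a match in the second, so both 1s and both 2s are removed and the result is empty.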
