matrix %in% matrix

北恋 2020-12-08 23:24

Suppose I have two matrices, each with two columns and differing numbers of rows. I want to check which pairs in one matrix also appear in the other matrix. If these were one

5 Answers

生来不讨喜 (OP)

2020-12-08 23:42

Coming in late to the game: I had previously written an algorithm using the "paste with delimiter" method, and then found this page. I was guessing that one of the code snippets here would be the fastest, but:

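library(digest)          # provides digest(), used by ramnath()
library(microbenchmark)

# Uses Andrie's %inm% operator exactly as above
andrie <- function(mfoo, nfoo) apply(mfoo, 1, `%inm%`, nfoo)

carl <- function(mfoo, nfoo) {
  # Collapse each row into one "_"-delimited string
  allrows <- unlist(sapply(1:nrow(mfoo), function(j) paste(mfoo[j, ], collapse = '_')))
  allfoo <- unlist(sapply(1:nrow(nfoo), function(j) paste(nfoo[j, ], collapse = '_')))
  # Keys of mfoo that do not occur in nfoo
  thewalls <- setdiff(allrows, allfoo)
  # Rows of mfoo with no match in nfoo (the assignment returns them invisibly)
  dowalls <- mfoo[allrows %in% thewalls, ]
}

# Hash each row with digest() and compare the hashes
ramnath <- function(a, x) apply(a, 1, digest) %in% apply(x, 1, digest)

# 100 x 4 test matrix; nfoo is a random 60-row subset of it
mfoo <- matrix(sample(1:100, 400, replace = TRUE), nrow = 100)
nfoo <- mfoo[sample(1:100, 60), ]

microbenchmark(andrie(mfoo, nfoo), carl(mfoo, nfoo), ramnath(mfoo, nfoo), times = 5)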

Unit: milliseconds
                expr       min        lq    median        uq        max neval
  andrie(mfoo, nfoo) 25.564196 26.527632 27.964448 29.687344 102.802004     5
    carl(mfoo, nfoo)  1.020310  1.079323  1.096855  1.193926   1.246523     5
 ramnath(mfoo, nfoo)  8.176164  8.429318  8.539644  9.258480   9.458608     5

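So apparently constructing character strings and doing a single set operation is fastest! By median time in this run, it is roughly 8 times faster than the digest approach and roughly 25 times faster than the row-wise %inm%. (PS: I checked, and all three algorithms give the same result.)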
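For anyone who wants the answer as a logical vector (one TRUE/FALSE per row, like `%in%` gives for plain vectors), here is a minimal sketch of the same "paste with delimiter" idea. The helper name `rows_in`, its `sep` argument, and the seed are my own choices for illustration, not something from the answers on this page. It builds the row keys with one vectorised `paste` call rather than a loop over rows, then cross-checks the result against a slow but obvious row-by-row comparison.

```r
# Hypothetical helper (not from the thread): TRUE for each row of `a`
# that also appears as a row of `b`.
rows_in <- function(a, b, sep = "_") {
  # One key per row; paste() is vectorised across the columns
  key_a <- do.call(paste, c(as.data.frame(a), sep = sep))
  key_b <- do.call(paste, c(as.data.frame(b), sep = sep))
  key_a %in% key_b
}

set.seed(1)  # arbitrary seed so the example is repeatable
mfoo <- matrix(sample(1:100, 400, replace = TRUE), nrow = 100)
idx  <- sample(1:100, 60)
nfoo <- mfoo[idx, ]

hits <- rows_in(mfoo, nfoo)

# Cross-check against a naive comparison of every row pair
naive <- sapply(seq_len(nrow(mfoo)), function(i)
  any(apply(nfoo, 1, function(r) all(r == mfoo[i, ]))))
stopifnot(identical(hits, naive))
```

One caveat with this approach: the delimiter must not occur in the data. With integer matrices like the one here, "_" is safe, but for character matrices the rows c("a_b", "c") and c("a", "b_c") would both collapse to the key "a_b_c" and be treated as equal.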
