matrix %in% matrix

北恋 2020-12-08 23:24

Suppose I have two matrices, each with two columns and differing numbers of rows. I want to see which pairs of one matrix are in the other matrix. If these were one

5 answers
  •  生来不讨喜
    2020-12-08 23:42

    Coming in late to the game: I had previously written an algorithm using the "paste with delimiter" method, and then found this page. I was guessing that one of the code snippets here would be the fastest, but:

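    library(digest)          # provides digest(), used by ramnath()
    library(microbenchmark)

    # using Andrie's %inm% operator exactly as above
    andrie <- function(mfoo, nfoo) apply(mfoo, 1, `%inm%`, nfoo)

    # paste each row into one delimited string, then do a single set operation
    carl <- function(mfoo, nfoo) {
      allrows  <- unlist(sapply(1:nrow(mfoo), function(j) paste(mfoo[j, ], collapse = '_')))
      allfoo   <- unlist(sapply(1:nrow(nfoo), function(j) paste(nfoo[j, ], collapse = '_')))
      thewalls <- setdiff(allrows, allfoo)        # row keys of mfoo that are absent from nfoo
      dowalls  <- mfoo[allrows %in% thewalls, ]   # value of the assignment is returned invisibly
    }

    # hash every row with digest() and compare the hashes
    ramnath <- function(a, x) apply(a, 1, digest) %in% apply(x, 1, digest)

    mfoo <- matrix(sample(1:100, 400, replace = TRUE), nrow = 100)
    nfoo <- mfoo[sample(1:100, 60), ]

    microbenchmark(andrie(mfoo, nfoo), carl(mfoo, nfoo), ramnath(mfoo, nfoo), times = 5)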
    
    Unit: milliseconds
                    expr       min        lq    median        uq            max neval
      andrie(mfoo, nfoo) 25.564196 26.527632 27.964448 29.687344     102.802004     5
        carl(mfoo, nfoo)  1.020310  1.079323  1.096855  1.193926       1.246523     5
     ramnath(mfoo, nfoo)  8.176164  8.429318  8.539644  9.258480       9.458608     5
    

    So apparently, at least in this benchmark, constructing character strings and doing a single set operation is fastest! (P.S. I checked, and all three algorithms give the same result.)
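For readers who want the "paste with delimiter" idea in isolation, here is a minimal sketch. The helper name `rows_in` and the small matrices are hypothetical and not from the thread. It assumes the delimiter never occurs inside the data, and that the values print unambiguously as strings (numeric values are converted with up to 15 significant digits). Unlike `carl()` above, it returns a logical vector with one entry per row of the first matrix, in the spirit of `%in%`.

```r
# Hypothetical helper (not from the thread): for each row of `m`,
# report whether an identical row exists in `n`.
rows_in <- function(m, n, sep = "_") {
  # paste() is vectorized across columns, so each row collapses to one key
  key_m <- do.call(paste, c(as.data.frame(m), sep = sep))
  key_n <- do.call(paste, c(as.data.frame(n), sep = sep))
  key_m %in% key_n
}

m <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)   # rows: (1,4), (2,5), (3,6)
n <- matrix(c(2, 3, 9, 5, 6, 9), ncol = 2)   # rows: (2,5), (3,6), (9,9)
rows_in(m, n)
```

Subsetting with the result, as in `m[rows_in(m, n), , drop = FALSE]`, gives the matching rows, and negating it gives the rows of `m` that are missing from `n`.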
