Search and find values row-wise in a dataframe

后悔当初 2021-01-06 11:00

My dataframe looks like this:

x1 <- c("a", "c", "f", "j")
x2 <- c("b", "c", "g", "k")
x3 <- c("b", "d", "h", NA)
x4 <- c("


        
4 answers
  •  梦毁少年i
    2021-01-06 11:14

    Another idea: preserving and operating on the "list" structure of a "data.frame", rather than converting it to an atomic structure (e.g. with sapply, as.matrix, do.call(_bind, ...) etc.), could be efficient. In this case we could use something like:

    as.numeric(Reduce("|", lapply(df, function(x) x %in% vec)))
    #[1] 1 0 1 0
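
The one-liner above can be unpacked step by step. This is only a minimal sketch: the columns x1 to x3 follow the question, x4 is left out because it is cut off there, and `vec` is an assumed set of lookup values chosen for illustration, not the one from the original post.

```r
# Illustrative data: x1..x3 as in the question; x4 is truncated there, so it is omitted.
df <- data.frame(
  x1 = c("a", "c", "f", "j"),
  x2 = c("b", "c", "g", "k"),
  x3 = c("b", "d", "h", NA),
  stringsAsFactors = FALSE
)
vec <- c("a", "g")  # assumed lookup values (hypothetical)

# Step 1: one logical vector per column, TRUE where the cell occurs in vec.
hits <- lapply(df, function(x) x %in% vec)

# Step 2: combine the per-column vectors element-wise with OR,
# giving TRUE for every row that has a match in at least one column.
row_has_match <- Reduce("|", hits)

# Step 3: convert TRUE/FALSE to 1/0.
result <- as.numeric(row_has_match)
print(result)
```

With this data, rows 1 and 3 contain a value from `vec`, so the result is 1 0 1 0. Note that `NA %in% vec` is FALSE here, so the missing value in x3 does not produce NA in the result.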
    

    And to compare with Ananda Mahto's approach, the fastest so far (using the larger "df"):

    AL = function() as.numeric(Reduce("|", lapply(df, function(x) x %in% vec)))
    AM = function() as.numeric(rowSums(`dim<-`(as.matrix(df) %in% vec, dim(df))) >= 1)
    identical(AM(), AL())
    #[1] TRUE
    microbenchmark::microbenchmark(AM(), AL(), times = 50)
    #Unit: milliseconds
    # expr      min       lq   median       uq      max neval
    # AM() 49.20072 53.53789 58.03740 66.76898 86.04280    50
    # AL() 45.24706 49.34271 51.43577 55.05866 74.79533    50
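
To make the AM() expression easier to follow, here is a small sketch of what each piece does, again on illustrative data (x1 to x3 from the question, a hypothetical `vec`). The key point is that `%in%` returns a plain vector without dimensions, so the `dim<-` call is needed to restore the rows-by-columns shape before `rowSums`.

```r
# Illustrative data, not the larger "df" used in the benchmark.
df <- data.frame(
  x1 = c("a", "c", "f", "j"),
  x2 = c("b", "c", "g", "k"),
  x3 = c("b", "d", "h", NA),
  stringsAsFactors = FALSE
)
vec <- c("a", "g")  # assumed lookup values (hypothetical)

# Matrix route: %in% flattens the matrix to a logical vector (column-major order).
flat <- as.matrix(df) %in% vec
m <- flat
dim(m) <- dim(df)                      # same effect as `dim<-`(flat, dim(df))
am <- as.numeric(rowSums(m) >= 1)      # 1 if a row has at least one match

# List route: stay column-wise and OR the per-column results together.
al <- as.numeric(Reduce("|", lapply(df, function(x) x %in% vec)))

print(identical(am, al))
```

Both routes test every cell once; they differ mainly in whether the data frame is first copied into a character matrix or processed one column at a time.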
    

    There does not appear to be any significant efficiency gain, but I guess it's worth noting that the two loops (in Reduce and lapply) did not prove to be as slow as might be expected. Both iterate once per column, not once per row.
