My dataframe looks like this:
x1 <- c("a", "c", "f", "j")
x2 <- c("b", "c", "g", "k")
x3 <- c("b", "d", "h", NA)
x4 <- c("
As another idea, preserving and operating on the "list" structure of a "data.frame", rather than converting it to an atomic object (i.e. via sapply, as.matrix, do.call(_bind, ...) etc.), could be efficient. In this case we could use something like:
as.numeric(Reduce("|", lapply(df, function(x) x %in% vec)))
#[1] 1 0 1 0
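To see how this works, here is a self-contained sketch with hypothetical sample data (a small df built from the question's first three columns, and an assumed vec; neither is the larger data used in the benchmark below): lapply yields one logical vector per column, and Reduce("|", ...) ORs those vectors elementwise, row by row.

```r
## hypothetical sample data, based on the question's columns
df <- data.frame(x1 = c("a", "c", "f", "j"),
                 x2 = c("b", "c", "g", "k"),
                 x3 = c("b", "d", "h", NA),
                 stringsAsFactors = FALSE)
vec <- c("a", "h")  # assumed search values

## one logical vector per column (NA %in% vec is FALSE, so NAs are safe)
lapply(df, function(x) x %in% vec)

## elementwise OR across columns, then coerce TRUE/FALSE to 1/0
as.numeric(Reduce("|", lapply(df, function(x) x %in% vec)))
#[1] 1 0 1 0
```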
And to compare with the fastest approach so far, Ananda Mahto's (using the larger "df"):
AL = function() as.numeric(Reduce("|", lapply(df, function(x) x %in% vec)))
AM = function() as.numeric(rowSums(`dim<-`(as.matrix(df) %in% vec, dim(df))) >= 1)
identical(AM(), AL())
#[1] TRUE
microbenchmark::microbenchmark(AM(), AL(), times = 50)
#Unit: milliseconds
# expr min lq median uq max neval
# AM() 49.20072 53.53789 58.03740 66.76898 86.04280 50
# AL() 45.24706 49.34271 51.43577 55.05866 74.79533 50
There does not appear to be any significant efficiency gain but, I guess, it's worth noting that the two loops (in Reduce and lapply) did not prove to be as slow as one would, probably, expect.