Fastest way to detect if vector has at least 1 NA?

南旧 2020-12-23 14:50

What is the fastest way to detect if a vector has at least 1 NA in R? I've been using:

sum( is.na( data ) ) > 0

But that

6 answers
  •  一向 (original poster)
    2020-12-23 15:11

    As of R 3.1.0, anyNA() is the way to do this. On atomic vectors it stops at the first NA instead of going through the entire vector, as any(is.na(x)) would. It also avoids creating an intermediate logical vector with is.na() that is immediately discarded. Borrowing Joran's example:

    x <- y <- runif(1e7)
    x[1e4] <- NA
    y[1e7] <- NA
    microbenchmark::microbenchmark(any(is.na(x)), anyNA(x), any(is.na(y)), anyNA(y), times=10)
    # Unit: microseconds
    #           expr        min         lq        mean      median         uq
    #  any(is.na(x))  13444.674  13509.454  21191.9025  13639.3065  13917.592
    #       anyNA(x)      6.840     13.187     13.5283     14.1705     14.774
    #  any(is.na(y)) 165030.942 168258.159 178954.6499 169966.1440 197591.168
    #       anyNA(y)   7193.784   7285.107   7694.1785   7497.9265   7865.064
    

    Notice that anyNA() is substantially faster even when the NA is the last value of the vector; this is in part because it avoids the intermediate logical vector.
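For a quick sanity check without the benchmark, the following minimal sketch compares the three idioms on small vectors. The vectors `clean` and `with_na` and the `has_na_*` names are made up for illustration and do not come from the original answer; the sketch assumes R 3.1.0 or later, where anyNA() is available.

```r
# Small illustrative vectors (hypothetical data, not the benchmark vectors).
clean   <- c(1.5, 2.0, 3.25)
with_na <- c(1.5, NA, 3.25)

# anyNA() returns a single logical and can stop at the first NA it finds.
has_na_fast <- anyNA(with_na)

# The older idioms give the same answer but first build a full logical
# vector with is.na() before reducing it.
has_na_any <- any(is.na(with_na))
has_na_sum <- sum(is.na(with_na)) > 0

print(c(has_na_fast, has_na_any, has_na_sum))

# A vector without missing values yields FALSE.
print(anyNA(clean))

# For numeric vectors NaN is also treated as missing, as with is.na().
print(anyNA(c(1, NaN)))
```

All three idioms agree on the result, so swapping `sum(is.na(data)) > 0` for `anyNA(data)` should not change behavior on atomic vectors; only the amount of work done differs.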
