R seems to support an efficient NA value in floating-point arrays. How does it represent it internally?
My (perhaps flawed) understanding is that moder
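R uses the special values defined for IEEE 754 floats to represent NA_real_ and Inf: NA_real_ is stored as a NaN, while Inf is the IEEE infinity. (A plain NA is a logical and is coerced to NA_real_ when it is used as a double.) We can use a simple C++ function to make the bit patterns explicit: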
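Rcpp::cppFunction('void print_hex(double x) {
  // Copy the raw bits of the double into an integer of the same size
  uint64_t y;
  static_assert(sizeof x == sizeof y, "Size does not match!");
  std::memcpy(&y, &x, sizeof y);
  // Print the bit pattern in hexadecimal, then restore decimal output
  Rcpp::Rcout << std::hex << y << std::dec << std::endl;
}', plugins = "cpp11", includes = c("#include <cstdint>", "#include <cstring>"))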
print_hex(NA_real_)
#> 7ff80000000007a2
print_hex(Inf)
#> 7ff0000000000000
print_hex(-Inf)
#> fff0000000000000
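The three bit patterns printed above can be split into the sign, exponent and mantissa fields of an IEEE 754 double. The following is only a minimal sketch in plain C++; DoubleFields, split_bits, split_double, is_infinity and is_nan are hypothetical helper names of mine, not part of R or Rcpp.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type (not part of R or Rcpp): the three fields
// of an IEEE 754 double.
struct DoubleFields {
    unsigned sign;           // 1 bit
    unsigned exponent;       // 11 bits
    std::uint64_t mantissa;  // 52 bits
};

// Split a raw 64-bit pattern into sign, exponent and mantissa.
DoubleFields split_bits(std::uint64_t bits) {
    DoubleFields f;
    f.sign = static_cast<unsigned>(bits >> 63);
    f.exponent = static_cast<unsigned>((bits >> 52) & 0x7ffu);
    f.mantissa = bits & ((std::uint64_t{1} << 52) - 1);
    return f;
}

// Same as split_bits, but starting from a double.
DoubleFields split_double(double x) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "Size does not match!");
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return split_bits(bits);
}

// All-ones exponent with a zero mantissa: +/- infinity.
bool is_infinity(const DoubleFields& f) {
    return f.exponent == 0x7ffu && f.mantissa == 0;
}

// All-ones exponent with a non-zero mantissa: NaN.
bool is_nan(const DoubleFields& f) {
    return f.exponent == 0x7ffu && f.mantissa != 0;
}
```

For example, split_bits(0x7ff80000000007a2ULL) gives an exponent of 0x7ff and a non-zero mantissa, so is_nan is true for it, whereas the two Inf patterns have a zero mantissa and differ only in the sign bit.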
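The exponent (bits 2 through 12, right after the sign bit) is all ones in each case. In IEEE 754 this marks a special value: if the mantissa is all zero it is an infinity, as for Inf and -Inf; if the mantissa is non-zero it is a NaN, as for NA_real_. The low 32 bits of NA_real_ are 0x7a2, which is 1954 in decimal. Here are some source code references.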
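To illustrate how an NA can be told apart from an ordinary NaN, here is a small sketch in plain C++. As far as I know, R's own check (R_IsNA in arithmetic.c) tests whether the value is a NaN whose low 32-bit word equals 1954; treat that as my assumption about R's internals. The names to_bits, from_bits and looks_like_R_NA are hypothetical and not R's API.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a double as its raw 64-bit pattern.
std::uint64_t to_bits(double x) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "Size does not match!");
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Reinterpret a raw 64-bit pattern as a double.
double from_bits(std::uint64_t bits) {
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Hypothetical check, modelled on my reading of R's R_IsNA:
// the value must be a NaN and its low 32-bit word must be 1954 (0x7a2).
bool looks_like_R_NA(double x) {
    return std::isnan(x) && (to_bits(x) & 0xffffffffu) == 1954u;
}
```

With this, the pattern 7ff80000000007a2 printed for NA_real_ passes the check, while a NaN with a zero payload (7ff8000000000000) and Inf do not. Since only the low word is compared, the check does not depend on whether the quiet bit in the upper part of the mantissa is set.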