Can std::numeric_limits::quiet_NaN double/float store some extra info?

Question

When storing double data in my data acquisition project, I mark all "missing" data with std::numeric_limits<double>::quiet_NaN(). However, I'd like to store some extra information to record why the data is "missing" (data transmission lost, bad checksum, no measurement done, internal error, ...), so I need many different "NaN" values in the end. They must all be identified as NaN by any legacy code (x != x).

I see in IEEE 754-1985 that the NaN fraction can be "anything except all 0 bits (since all 0 bits represents infinity)". Can the fraction be used to safely store some extra info? If yes, how should I do this? Would this be totally safe on all platforms and with any compiler?

Here is what I was thinking about:

double GetMyNaN1()
{
    double value = std::numeric_limits<double>::quiet_NaN();
    // customize it!
    return value;
}

double GetMyNaN2()
{
    double value = std::numeric_limits<double>::quiet_NaN();
    // customize it!
    return value;
}

bool IsMyNan1( double value )
{
    // return true if value was created by GetMyNaN1() 
}

bool IsMyNan2( double value )
{
    // return true if value was created by GetMyNaN2() 
}

int main()
{
    double regular_nan = std::numeric_limits<double>::quiet_NaN();
    double my_nan_1 = GetMyNaN1();
    double my_nan_2 = GetMyNaN2();

    assert( std::isnan( regular_nan ) && !IsMyNan1( regular_nan ) && !IsMyNan2( regular_nan ) );
    assert( std::isnan( my_nan_1 ) && IsMyNan1( my_nan_1 ) && !IsMyNan2( my_nan_1 ) );
    assert( std::isnan( my_nan_2 ) && !IsMyNan1( my_nan_2 ) && IsMyNan2( my_nan_2 ) );
    return 0;
}

The code must work on all platforms.

Answer 1:

This is known as NaN-boxing. It’s very widely used, but there’s no language-defined way of doing it since (as usual) the bit layout isn’t specified. On real implementations, with care you can get the right behavior via the obvious bit operations even though formally it’s undefined (if you use type punning via reinterpret_cast or a union) or at best unspecified (if you use memcpy or bit_cast).

Answer 2:

Using NaN-boxing as recommended by Davis, I could easily implement this with code that works under Windows (MSVC) and Linux (gcc), which is good enough for my needs.

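#include <bitset>
#include <cassert>
#include <cmath>
#include <cstring>
#include <iostream>
#include <limits>
#include <string>

// Print a double together with its raw 64-bit pattern.
void showValue( double val, const std::string& what )
{
    unsigned long long u = 0;
    static_assert( sizeof(u) == sizeof(val), "expected a 64-bit double" );
    // memcpy avoids the undefined behavior of reading an inactive union member
    std::memcpy( &u, &val, sizeof(val) );
    std::bitset<sizeof(double) * 8> b(u);
    std::cout << val << " (" << what << "): " << b.to_string() << std::endl;
}

// Store mask in the first byte in memory of value. On a little-endian
// machine this is the least significant byte of the fraction.
double customizeNaN( double value, char mask )
{
    double res = value;
    char* ptr = reinterpret_cast<char*>( &res );
    assert( ptr[0] == 0 );  // the byte must be unused in the incoming NaN
    ptr[0] |= mask;
    return res;
}

// Return true if value is a NaN that carries exactly mask in its first byte in memory.
bool isCustomNaN( double value, char mask )
{
    const char* ptr = reinterpret_cast<const char*>( &value );
    // without the isnan check, ordinary numbers with a matching low byte would also pass
    return std::isnan( value ) && ptr[0] == mask;
}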

int main()
{
    double regular_nan = std::numeric_limits<double>::quiet_NaN();
    double myNaN1 = customizeNaN( regular_nan, 0x01 );
    double myNaN2 = customizeNaN( regular_nan, 0x02 );

    showValue( regular_nan, "regular" );
    showValue( myNaN1, "custom 1" );
    showValue( myNaN2, "custom 2" );

    assert( std::isnan(regular_nan) );
    assert( std::isnan(myNaN1) );
    assert( std::isnan(myNaN2) );

    assert( !isCustomNaN(regular_nan,0x01) );
    assert( isCustomNaN(myNaN1,0x01) );
    assert( !isCustomNaN(myNaN2,0x01) );

    assert( !isCustomNaN(regular_nan,0x02) );
    assert( !isCustomNaN(myNaN1,0x02) );
    assert( isCustomNaN(myNaN2,0x02) );

    return 0;
}

This code assumes that quiet_NaN is always 0111111111111000000000000000000000000000000000000000000000000000 (0x7FF8000000000000): a sign bit of 0, 11 exponent bits set to 1, then the fraction 1000000000000000000000000000000000000000000000000000.

The code could be adapted to:

Support both float/double through a template implementation
Support big/little endianness (to decide where the mask should be applied)
Support any NaN representation. My assumption is that the last 8 bits are 0 and can be used as a mask, but IEEE 754-1985 allows NaN to be represented differently, for instance 0111111111110000000000000000000000000000000000000000000000000001, in which case using the last 8 bits as a mask would be a bad idea. There will always be a way to customize the fraction, though: the value is still a NaN as long as the fraction bits are not all 0 (which would represent +Inf instead of NaN).

Edit: Note that this implementation is not that good, as the extended info is lost when casting from float to double. See my answer to "std::num_put issue with nan-boxing due to auto-cast from float to double" for another implementation that is safer.

Source: https://stackoverflow.com/questions/53629760/can-stdnumeric-limitsquiet-nan-double-float-store-some-extra-info

Tags

c++

c++11

nan