Convert extended precision float (80-bit) to double (64-bit) in MSVC

暖寄归人 2020-12-15 13:25

What is the most portable and "right" way to convert an extended-precision float (an 80-bit value, also known as long double in some compilers) to double?

4 answers
  •  甜味超标
    2020-12-15 14:23

    If your compiler or platform doesn't have native support for 80-bit floating-point values, you have to decode the value yourself.

    Assuming that the 80-bit float is stored in big-endian byte order within a byte buffer, at a specific offset, you can do it like this:

    float64 C_IOHandler::readFloat80(IColl buffer, uint32 *ref_offset)
    {
        uint32 &offset = *ref_offset;
    
        //80-bit floating-point value according to the IEEE-754 specification and the Standard Apple Numeric Environment specification:
        //1 sign bit, 15-bit exponent, 1 explicit integer (normalization) bit, 63-bit fraction
    
        float64 sign;
        if ((buffer[offset] & 0x80) == 0x00)
            sign = 1;
        else
            sign = -1;
        uint32 exponent = (((uint32)buffer[offset] & 0x7F) << 8) | (uint32)buffer[offset + 1];
        uint64 mantissa = readUInt64BE(buffer, offset + 2);
    
        //If the highest bit of the mantissa is set, then this is a normalized number.
        float64 normalizeCorrection;
        if ((mantissa & 0x8000000000000000) != 0x00)
            normalizeCorrection = 1;
        else
            normalizeCorrection = 0;
        mantissa &= 0x7FFFFFFFFFFFFFFF;
    
        offset += 10;
    
        //value = (-1) ^ s * (normalizeCorrection + m / 2 ^ 63) * 2 ^ (e - 16383)
        return (sign * (normalizeCorrection + (float64)mantissa / ((uint64)1 << 63)) * g_Math->toPower(2, (int32)exponent - 16383));
    }
    

    This is how I did it, and it compiles fine with g++ 4.5.0. It is not a fast solution, but it works. The code relies only on integer and double arithmetic, so it should be portable to other platforms, though I haven't tested that.
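The function above depends on project-specific names (`IColl`, `readUInt64BE`, `g_Math->toPower`). As a rough, self-contained sketch of the same decoding, the version below uses only the C++ standard library. The name `extended80_to_double` and the assumption of a 10-byte big-endian input are illustrative choices, not part of the original answer. Unlike the code above, it also maps an all-ones exponent to infinity or NaN. This route applies to MSVC as well, where `long double` has the same 64-bit representation as `double`, so the 80-bit value has to be handled as raw bytes.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the original answer): decode a 10-byte,
// big-endian 80-bit extended-precision value into a double.
double extended80_to_double(const unsigned char* bytes)
{
    // Byte 0 holds the sign bit and the upper 7 exponent bits;
    // byte 1 holds the lower 8 exponent bits.
    const bool negative = (bytes[0] & 0x80) != 0;
    const int exponent = ((bytes[0] & 0x7F) << 8) | bytes[1];

    // Bytes 2..9 hold the 64-bit significand, including the explicit integer bit.
    std::uint64_t significand = 0;
    for (int i = 2; i < 10; ++i)
        significand = (significand << 8) | bytes[i];

    double magnitude;
    if (exponent == 0x7FFF) {
        // All-ones exponent: infinity if the 63 fraction bits are zero, otherwise NaN.
        const std::uint64_t fraction = significand & 0x7FFFFFFFFFFFFFFFULL;
        magnitude = (fraction == 0)
            ? std::numeric_limits<double>::infinity()
            : std::numeric_limits<double>::quiet_NaN();
    } else {
        // A zero exponent marks zero and denormals, which share the scale of exponent 1.
        const int e = (exponent == 0) ? 1 : exponent;
        // value = significand * 2^(e - 16383 - 63); the uint64 -> double cast
        // rounds the 64-bit significand to double's 53 bits.
        magnitude = std::ldexp(static_cast<double>(significand), e - 16383 - 63);
    }
    return negative ? -magnitude : magnitude;
}
```

For example, the bytes `3F FF 80 00 00 00 00 00 00 00` decode to 1.0, and `C0 00 80 00 00 00 00 00 00 00` decode to -2.0. Values outside the range of `double` become infinity or zero through `std::ldexp`. An x87 `long double` in memory is little-endian, so its 10 value bytes would need to be reversed before calling this sketch.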
