Ranges of floating-point data types in C?

死守一世寂寞 2020-11-30 03:49

I am reading a C book that discusses the ranges of floating-point types. The author gives this table:

Type     Smallest Positive Value  Largest value      Precision
====         


        
6 answers
  •  遥遥无期
    2020-11-30 04:18

    Infinity, NaN and subnormals

    These are important caveats that no other answer has mentioned so far.

    First read this introduction to IEEE 754 and subnormal numbers: What is a subnormal floating point number?

    Then, for single precision floats (32-bit):

    • IEEE 754 says that if the exponent is all ones (0xFF == 255), then it represents either NaN or Infinity.

      This is why the largest non-infinite number has exponent 0xFE == 254 and not 0xFF.

      Then with the bias, it becomes:

      254 - 127 == 127
      
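    • FLT_MIN is the smallest normal number. But there are smaller subnormal ones! Those use the all-zero exponent field, the slot that would otherwise mean 0 - 127 == -127. They are actually scaled by 2^-126, with no implicit leading 1.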

    All asserts of the following program pass on Ubuntu 18.04 amd64:

    #include <assert.h>
    #include <float.h>
    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    
    float float_from_bytes(
        uint32_t sign,
        uint32_t exponent,
        uint32_t fraction
    ) {
        uint32_t bytes;
        bytes = 0;
        bytes |= sign;
        bytes <<= 8;
        bytes |= exponent;
        bytes <<= 23;
        bytes |= fraction;
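        /* Pun through a union: *(float*)&bytes would break strict aliasing. */
        union { uint32_t u; float f; } pun = { .u = bytes };
        return pun.f;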
    }
    
    int main(void) {
        /* All 1 exponent and non-0 fraction means NaN.
         * There are of course many possible representations,
         * and some have special semantics such as signalling vs not.
         */
        assert(isnan(float_from_bytes(0, 0xFF, 1)));
        assert(isnan(NAN));
        printf("nan                  = %e\n", NAN);
    
        /* All 1 exponent and 0 fraction means infinity. */
        assert(INFINITY == float_from_bytes(0, 0xFF, 0));
        assert(isinf(INFINITY));
        printf("infinity             = %e\n", INFINITY);
    
        /* ANSI C defines FLT_MAX as the largest non-infinite number. */
        assert(FLT_MAX == 0x1.FFFFFEp127f);
        /* Not 0xFF because that is infinite. */
        assert(FLT_MAX == float_from_bytes(0, 0xFE, 0x7FFFFF));
        assert(!isinf(FLT_MAX));
        assert(FLT_MAX < INFINITY);
        printf("largest non infinite = %e\n", FLT_MAX);
    
        /* ANSI C defines FLT_MIN as the smallest non-subnormal number. */
        assert(FLT_MIN == 0x1.0p-126f);
        assert(FLT_MIN == float_from_bytes(0, 1, 0));
        assert(isnormal(FLT_MIN));
        printf("smallest normal      = %e\n", FLT_MIN);
    
        /* The smallest non-zero subnormal number. */
        float smallest_subnormal = float_from_bytes(0, 0, 1);
        assert(smallest_subnormal == 0x0.000002p-126f);
        assert(0.0f < smallest_subnormal);
        assert(!isnormal(smallest_subnormal));
        printf("smallest subnormal   = %e\n", smallest_subnormal);
    
        return EXIT_SUCCESS;
    }
    

    GitHub upstream.

    Compile and run with:

    gcc -ggdb3 -O0 -std=c11 -Wall -Wextra -Wpedantic -Werror -o subnormal.out subnormal.c
    ./subnormal.out
    

    Output:

    nan                  = nan
    infinity             = inf
    largest non infinite = 3.402823e+38
    smallest normal      = 1.175494e-38
    smallest subnormal   = 1.401298e-45
    
