Floating point arithmetic and machine epsilon

北恋 2020-12-19 10:09

I'm trying to compute an approximation of the epsilon value for the float type (and I know it's already in the standard library).

The epsilon values o

2 answers
  • 2020-12-19 10:30

    The compiler is allowed to evaluate float expressions at a higher precision than the type itself, so it looks like the first expression is evaluated in long double precision. In the second expression, the cast forces the result back down to float.

    In answer to some of your additional questions and the discussion below: you are basically looking for the smallest non-zero difference from 1 that a given floating-point type can represent. Depending on the setting of FLT_EVAL_METHOD, a compiler may evaluate all floating-point expressions at a higher precision than the types involved. On a Pentium, the internal registers of the floating-point unit are traditionally 80 bits wide, and it is convenient to use that precision for all the smaller floating-point types. So in the end your test depends on the precision of your != comparison. Without an explicit cast, the precision of this comparison is determined by your compiler, not by your code. With the explicit cast, you narrow the comparison to the type you want.

    As you confirmed, your compiler has set FLT_EVAL_METHOD to 2, so it uses the highest available precision for every floating-point calculation.

    As a conclusion to the discussion below, we are confident that gcc versions prior to 4.5 have a bug in their implementation of the FLT_EVAL_METHOD=2 case, and that it is fixed in version 4.6 at the latest. If the integer constant 2 is used in the expression instead of the floating-point constant 2.0, the cast to float is omitted in the generated assembly. It is also worth noting that from optimization level -O1 upward these older compilers produce the right results, but the generated assembly is quite different and contains only a few floating-point operations.

  • 2020-12-19 10:33

    A C99 compiler can evaluate floating-point expressions as if they were of a more precise floating-point type than their actual type.

    The macro FLT_EVAL_METHOD is set by the compiler to indicate the strategy:

    -1 indeterminable;

    0 evaluate all operations and constants just to the range and precision of the type;

    1 evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;

    2 evaluate all operations and constants to the range and precision of the long double type.

    For historical reasons, two common choices when targeting the x86 processors are 0 and 2.

    File m.c is your first program. Compiling it with my compiler in two different ways, I get:

    $ gcc -std=c99 -mfpmath=387 m.c
    $ ./a.out 
    float eps = 1.084202e-19
    $ gcc -std=c99  m.c
    $ ./a.out 
    float eps = 1.192093e-07
    

    If I compile this other program below, the compiler sets the macro according to what it does:

    #include <stdio.h>
    #include <float.h>
    
    int main(void){
      /* FLT_EVAL_METHOD is defined by the compiler in <float.h> */
      printf("%d\n", (int)FLT_EVAL_METHOD);
      return 0;
    }
    

    Results:

    $ gcc -std=c99 -mfpmath=387 t.c
    $ ./a.out 
    2
    $ gcc -std=c99 t.c
    $ ./a.out 
    0
    