Adding two floating-point numbers

误落风尘 2020-12-06 10:04

I would like to compute the sum, rounded up, of two IEEE 754 binary64 numbers. To that end I wrote the C99 program below:

#include 
#include &         


        
2 answers
  •  眼角桃花
    2020-12-06 10:19

    Passing -frounding-math to clang or gcc tells the compiler that the code might run with a non-default rounding mode. It's not fully safe (the compiler still assumes the same rounding mode is active the whole time), but it's better than nothing. You might still need to use volatile to avoid CSE in some cases, or maybe the noinline wrapper trick from the other answer, which in practice may work even better if you limit it to a single operation.


    As you noticed, GCC doesn't support #pragma STDC FENV_ACCESS ON. The default behaviour is like FENV_ACCESS OFF. Instead, you have to use command line options (or maybe per-function attributes) to control FP optimizations.

    As described in https://gcc.gnu.org/wiki/FloatingPointMath, -frounding-math is not on by default, so GCC assumes the default rounding mode when doing constant propagation and other optimizations at compile-time.

    But with gcc -O3 -frounding-math, constant propagation is blocked, even if you don't call fesetround. What's actually happening is that GCC makes asm that's safe even if the rounding mode had already been set to something else before main was called.

    But unfortunately, as the wiki notes, GCC still assumes that the same rounding mode is in effect everywhere (GCC bug #34678). That means it will CSE two calculations of the same inputs before/after a call to fesetround, because it doesn't treat fesetround as special.

    #include <fenv.h>
    #pragma STDC FENV_ACCESS ON
    
    void foo(double *restrict out){
        out[0] = 0x1.0p0 + 0x1.0p-80;
        fesetround(FE_UPWARD);
        out[1] = 0x1.0p0 + 0x1.0p-80;
    }
    

    This compiles as follows (Godbolt) with gcc10.2 (and essentially the same with clang10.1). The Godbolt link also includes your main, which does produce the asm you want.

    foo:
            push    rbx
            mov     rbx, rdi
            sub     rsp, 16
            movsd   xmm0, QWORD PTR .LC1[rip]
            addsd   xmm0, QWORD PTR .LC0[rip]     # runtime add
            movsd   QWORD PTR [rdi], xmm0         # store out[0]
            mov     edi, 2048                     # FE_UPWARD (0x800 on x86)
            movsd   QWORD PTR [rsp+8], xmm0       # save a local temporary for later
            call    fesetround
            movsd   xmm0, QWORD PTR [rsp+8]
            movsd   QWORD PTR [rbx+8], xmm0       # store the same value, not recalc
            add     rsp, 16
            pop     rbx
            ret
    

    This is the same problem @Marc Glisse warned about in comments under the other answer: it would bite if your noinline function did the same math before and after changing the rounding mode.

    (And also that it's partly luck that GCC chose not to do the math before calling fesetround the first time, so it would only have to spill the result instead of both inputs. x86-64 System V doesn't have any call-preserved XMM regs.)
