MinGW GCC 4.9.1 and floating-point determinism

问题

I wrote a small program to compute the Euclidean norm of a 3-coordinate vector. Here it is:

#include <array>
#include <cmath>
#include <iostream>

template<typename T, std::size_t N>
auto norm(const std::array<T, N>& arr)
    -> T
{
    T res{};
    for (auto value: arr)
    {
        res += value * value;
    }
    return std::sqrt(res);
}

int main()
{
    std::array<double, 3u> arr = { 4.0, -2.0, 6.0 };
    std::cout << norm(arr) - norm(arr) << '\n';
}

On my computer, it prints -1.12323e-016.

I know that floating point types should be handled with care. However, I thought that floating-point operations were at least somehow deterministic. This article about floating-point determinism states that:

Some of the things that are guaranteed are the results of addition, subtraction, multiplication, division, and square root. The results of these operations are guaranteed to be the exact result correctly rounded (more on that later) so if you supply the same input value(s), with the same global settings, and the same destination precision you are guaranteed the same result.

As you can see, the only operations that this program does on floating-point values are addition, subtraction, multiplication and square root. If I trust the article I quoted above, considering that it runs in a single thread and that I do not change the rounding modes or anything else floating-point related, I thought that norm(arr) - norm(arr) would be 0 since I do exactly the same operations on the same values twice.

Are my assumptions wrong, or is this a case of compiler not strictly conformant with regard to IEEE floating-point math? I am currently using MinGW-W64 GCC 4.9.1 32 bits (I tried every optimization level from -O0 to -O3). The same program with MinGW-W64 GCC 4.8.x displayed 0, which is what I would have expected.

EDIT: I disassembled the code. I won't post the whole generated assembly because it is too big. However, I believe that the relevant part is here:

call    ___main
fldl    LC0
fstpl   -32(%ebp)
fldl    LC1
fstpl   -24(%ebp)
fldl    LC2
fstpl   -16(%ebp)
leal    -32(%ebp), %eax
movl    %eax, (%esp)
call    __Z4normIdLj3EET_RKSt5arrayIS0_XT0_EE
fstpl   -48(%ebp)
leal    -32(%ebp), %eax
movl    %eax, (%esp)
call    __Z4normIdLj3EET_RKSt5arrayIS0_XT0_EE
fsubrl  -48(%ebp)
fstpl   (%esp)
movl    $__ZSt4cout, %ecx
call    __ZNSolsEd
subl    $8, %esp
movl    $10, 4(%esp)
movl    %eax, (%esp)
call    __ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_c
movl    $0, %eax
movl    -4(%ebp), %ecx
.cfi_def_cfa 1, 0
leave

As you can see, __Z4normIdLj3EET_RKSt5arrayIS0_XT0_EE is called twice, therefore, it is not inlined. I don't understand the whole thing though, and can't tell what is the problem.

回答1:

As @MatthiasB pointed out, this seems to be an issue of gcc temporarily storing an 80 bit floating point value into a 64 bit register/memory location. Consider the following simplified program which still reproduces the issue:

#include <cmath>
#include <iostream>

double norm() {
    double res = 4.0 * 4.0 + (-2.0 * -2.0) + (6.0 * 6.0);
    return std::sqrt(res);
}

int main() {
    std::cout << norm() - norm() << '\n';
    return 0;
}

The assembly code of the essential part norm() - norm() looks like this (using 32 bit mingw 4.8.0 compiler)

...
call    __Z4normv     ; call norm()
fstpl   -16(%ebp)     ; store result (80 bit) in temporary (64 bit!)
call    __Z4normv     ; call norm() again
fsubrl  -16(%ebp)     ; subtract result (80 bit) from temporary (64 bit!)
...

Essentially, I would consider this a gcc bug, but it seems to be a complicated topic ...

回答2:

There is a difference in precision of a floating point number depending on where it is stored. If the compiler keeps one variable in a register, it has higher precision as a variable stored in memory. You can try to force your variables to be stored in memory before, like:

int main()
{
    std::array<double, 3u> arr = { 4.0, -2.0, 6.0 };
    volatile double v1 = norm(arr);
    volatile double v2 = norm(arr);
    std::cout << v1 - v2 << '\n';
}

This gives you the expected result of 0.

来源：https://stackoverflow.com/questions/25891379/mingw-gcc-4-9-1-and-floating-point-determinism

标签

c++

c++11

floating-point

mingw-w64

gcc4.9