fast-math

Negative NaN is not a NaN?

Submitted by 泄露秘密 on 2019-12-17 18:46:48
Question: While writing some test cases, some of the tests check for a NaN result. I tried using std::isnan, but the assert fails: Assertion `std::isnan(x)' failed. After printing the value of x, it turned out to be negative NaN (-nan), which is totally acceptable in my case. I then tried to use the fact that NaN != NaN with assert(x == x), but the compiler does me a 'favor' and optimises the assert away. Making my own isNaN function gets optimised away as well. How can I check for

What does gcc's ffast-math actually do?

Submitted by 心不动则不痛 on 2019-12-17 02:26:35
Question: I understand that gcc's -ffast-math flag can greatly increase the speed of float operations and goes outside of IEEE standards, but I can't seem to find information on what is really happening when it's on. Can anyone explain some of the details and maybe give a clear example of how something would change if the flag was on or off? I did try digging through S.O. for similar questions but couldn't find anything explaining the workings of ffast-math. Answer 1: As you mentioned, it allows optimizations

Is there a -ffast-math flag equivalent for the Visual Studio C++ compiler

Submitted by ◇◆丶佛笑我妖孽 on 2019-12-10 12:57:46
Question: I'm working with the default C++ compiler (I guess it's called the "Visual Studio C++ compiler") that comes with Visual Studio 2013, with the flag /Ox (Full Optimization). Due to floating-point side effects, I must disable the -ffast-math flag when using the gcc compiler. Is there an equivalent option in the configuration of the Visual Studio C++ compiler? Answer 1: You are looking for /fp:precise, although that is also the default. If you need the strictest floating point
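For reference, a hedged mapping of the flags discussed; these build lines are illustrative, not taken from the answer:

```
# GCC: value-unsafe fast math on / off
g++ -O2 -ffast-math main.cpp
g++ -O2 main.cpp                  # default: no fast-math

# MSVC (cl.exe): closest equivalents
cl /Ox /fp:fast main.cpp          # rough analogue of -ffast-math
cl /Ox /fp:precise main.cpp       # the default, as the answer notes
```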

Auto vectorization on double and ffast-math

Submitted by 大憨熊 on 2019-12-06 00:44:15
Question: Why is it mandatory to use -ffast-math with g++ to achieve vectorization of loops using doubles? I don't like -ffast-math because I don't want to lose precision. Answer 1: You don't necessarily lose precision with -ffast-math. It only affects the handling of NaN, Inf etc. and the order in which operations are performed. If you have a specific piece of code where you do not want GCC to reorder or simplify computations, you can mark variables as being used with an asm statement. For instance

gcc, simd intrinsics and fast-math concepts

Submitted by 心已入冬 on 2019-12-04 09:19:17
Question: Hi all :) I'm trying to get the hang of a few concepts regarding floating point, SIMD/math intrinsics and the fast-math flag for gcc. More specifically, I'm using MinGW with gcc v4.5.0 on an x86 CPU. I've searched around for a while now, and this is what I (think I) understand at the moment: When I compile with no flags, any FP code will be standard x87, with no SIMD intrinsics, and the math.h functions will be linked from msvcrt.dll. When I use -mfpmath, -msse and/or -march so that MMX/SSE/AVX code

Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?

Submitted by 好久不见. on 2019-12-03 21:53:35
I am doing some numerical optimization on a scientific application. One thing I noticed is that GCC will optimize the call pow(a,2) by compiling it into a*a, but the call pow(a,6) is not optimized and will actually call the library function pow, which greatly slows down performance. (In contrast, the Intel C++ Compiler, executable icc, will eliminate the library call for pow(a,6).) What I am curious about is that when I replaced pow(a,6) with a*a*a*a*a*a using GCC 4.5.1 and the options "-O3 -lm -funroll-loops -msse4", it uses 5 mulsd instructions: movapd %xmm14, %xmm13 mulsd %xmm14, %xmm13

Mingw32 std::isnan with -ffast-math

Submitted by て烟熏妆下的殇ゞ on 2019-12-01 11:26:34
I am compiling the following code with the -ffast-math option:

#include <limits>
#include <cmath>
#include <iostream>

int main() {
    std::cout << std::isnan(std::numeric_limits<double>::quiet_NaN()) << std::endl;
}

I am getting 0 as output. How can my code tell whether a floating point number is NaN when it is compiled with -ffast-math? Note: on Linux, std::isnan works even with -ffast-math. Since -ffast-math instructs GCC not to handle NaNs, it is expected that isnan() has undefined behaviour. Returning 0 is therefore valid. You can use the following fast replacement for isnan(): #if

std::isinf does not work with -ffast-math. how to check for infinity

Submitted by 浪尽此生 on 2019-11-30 14:43:13
Sample code:

#include <iostream>
#include <cmath>
#include <stdint.h>

using namespace std;

static bool my_isnan(double val) {
    union { double f; uint64_t x; } u = { val };
    return (u.x << 1) > 0x7ff0000000000000u;
}

int main() {
    cout << std::isinf(std::log(0.0)) << endl;
    cout << std::isnan(std::sqrt(-1.0)) << endl;
    cout << my_isnan(std::sqrt(-1.0)) << endl;
    cout << __isnan(std::sqrt(-1.0)) << endl;
    return 0;
}

With -ffast-math, that code prints "0, 0, 1, 1"; without, it prints "1, 1, 1, 1". Is that correct? I thought that std::isinf / std::isnan should still work with -ffast

Negative NaN is not a NaN?

Submitted by 可紊 on 2019-11-28 08:55:48
While writing some test cases, some of the tests check for a NaN result. I tried using std::isnan, but the assert fails: Assertion `std::isnan(x)' failed. After printing the value of x, it turned out to be negative NaN (-nan), which is totally acceptable in my case. I then tried to use the fact that NaN != NaN with assert(x == x), but the compiler does me a 'favor' and optimises the assert away. Making my own isNaN function gets optimised away as well. How can I check for both equality of NaN and -NaN? This is embarrassing. The reason the compiler (GCC in this case) was

Does any floating point-intensive code produce bit-exact results in any x86-based architecture?

Submitted by 我的未来我决定 on 2019-11-28 00:16:48
I would like to know whether any code in C or C++ using floating-point arithmetic would produce bit-exact results on any x86-based architecture, regardless of the complexity of the code. To my knowledge, any x86 architecture since the Intel 8087 uses an FPU prepared to handle IEEE-754 floating-point numbers, and I cannot see any reason why the result would differ between architectures. However, if they were different (namely due to a different compiler or a different optimization level), would there be some way to produce bit-exact results by just configuring the compiler? Table of