Why does this bit of code,

const float x[16] = { 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8,
                      1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6 };
It's due to denormalized floating-point use. Denormal (subnormal) values are numbers smaller in magnitude than the smallest normal float, and many CPUs handle them far more slowly than normal values, which is where the performance penalty comes from. How do you get rid of them and the penalty? Having scoured the Internet for ways of killing denormal numbers, it seems there is no "best" way to do this yet. I have found these three methods that may work best in different environments:
Might not work in some GCC environments:
// Requires #include <fenv.h>
fesetenv(FE_DFL_DISABLE_SSE_DENORMS_ENV);
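As a minimal sketch of how this first method might be wrapped: the #ifdef guard below reflects that FE_DFL_DISABLE_SSE_DENORMS_ENV is an extension to <fenv.h> (it is provided on macOS, for example) rather than standard C, which is why the call may be missing in some GCC environments. The function name is just an illustration.

#include <fenv.h>
#include <stdio.h>

void disable_denormals(void)
{
#ifdef FE_DFL_DISABLE_SSE_DENORMS_ENV
    // Install the environment that disables SSE denormals;
    // fesetenv() returns 0 on success.
    if (fesetenv(FE_DFL_DISABLE_SSE_DENORMS_ENV) != 0)
        printf("could not install the no-denormals environment\n");
#else
    printf("FE_DFL_DISABLE_SSE_DENORMS_ENV is not available on this platform\n");
#endif
}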
Might not work in some Visual Studio environments:
// Requires #include <xmmintrin.h>
_mm_setcsr( _mm_getcsr() | (1<<15) | (1<<6) );
// Does both FTZ and DAZ bits. You can also use just hex value 0x8040 to do both.
// You might also want to use the underflow mask (1<<11)
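A common pattern with this method is to save the caller's MXCSR value, enable FTZ/DAZ only around a denormal-prone stretch of work, and restore the original mode afterwards. This is a sketch for x86 with SSE; process_block is just a toy stand-in for a real processing routine.

#include <xmmintrin.h>

// Toy processing loop standing in for real DSP work (illustrative only).
static void process_block(float *buf, int n)
{
    float state = 1.0f;
    for (int i = 0; i < n; ++i) {
        state *= 0.5f;      // decays toward zero; without FTZ it eventually goes denormal
        buf[i] += state;
    }
}

void process_without_denormals(float *buf, int n)
{
    unsigned int saved = _mm_getcsr();          // remember the caller's mode
    _mm_setcsr(saved | (1 << 15) | (1 << 6));   // set FTZ (bit 15) and DAZ (bit 6)
    process_block(buf, n);
    _mm_setcsr(saved);                          // restore the original MXCSR
}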
Appears to work in both GCC and Visual Studio:
// Requires #include <xmmintrin.h>
// Requires #include <pmmintrin.h>
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
_MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
The Intel compiler has options to disable denormals by default on modern Intel CPUs (the -ftz switch on Linux, /Qftz on Windows).
Compiler switches. -ffast-math, -msse or -mfpmath=sse will disable denormals and make a few other things faster, but unfortunately also do lots of other approximations that might break your code. Test carefully! The equivalent of fast-math for the Visual Studio compiler is /fp:fast, but I haven't been able to confirm whether this also disables denormals.
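One way to test whether a given set of switches (or any of the runtime methods above) actually flushes denormals on your toolchain is a tiny check like this: it multiplies the smallest normal float by 0.25 and asks whether the result is still subnormal. The volatile qualifiers keep the compiler from folding the computation away at compile time.

#include <float.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    volatile float tiny = FLT_MIN;       // smallest normal float
    volatile float r = tiny * 0.25f;     // subnormal unless flushed to zero

    printf("denormals are %s\n",
           fpclassify(r) == FP_SUBNORMAL ? "kept" : "flushed to zero");
    return 0;
}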