Implementing rint() in x86-64

Question

MSVC 2012 doesn't have the rint() function. For 32-bit, I'm using the following:

double rint(double x) {
    __asm {
        fld x
        frndint
    }
}

This doesn't work in x64, because MSVC does not support inline assembly there. There's _mm_round_sd(), but that requires SSE4.1. What is an efficient, preferably branchless, way of getting the same behavior?

Answer 1:

rint 64-bit mode

#include <emmintrin.h>

static inline double rint (double const x) {
    return (double)_mm_cvtsd_si32(_mm_load_sd(&x));
}

See the lrint examples in Agner Fog's Optimizing C++ manual:

32-bit mode

// Example 14.19
static inline int lrint (double const x) { // Round to nearest integer
    int n;
    #if defined(__unix__) || defined(__GNUC__)
    // 32-bit Linux, Gnu/AT&T syntax:
    __asm ("fldl %1 \n fistpl %0 " : "=m"(n) : "m"(x) : "memory" );
    #else
    // 32-bit Windows, Intel/MASM syntax:
    __asm fld qword ptr x;
    __asm fistp dword ptr n;
    #endif
    return n;
}

64-bit mode

// Example 14.21. // Only for SSE2 or x64
#include <emmintrin.h>

static inline int lrint (double const x) {
    return _mm_cvtsd_si32(_mm_load_sd(&x));
}

Edit: I just realized that this method limits the values to +/- 2^31. A version with a larger range is complicated with SSE2 alone (but easy with SSE4.1). See the round function in Agner Fog's Vector Class, in the file vectorf128.h, for an example.
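
One possible SSE2-only way around the 2^31 limit, sketched here as an assumption and not as the method used in vectorf128.h, is the common trick of adding and then subtracting 2^52 with the sign of x. For |x| < 2^52 the addition forces the fraction bits to be rounded away in the current MXCSR rounding mode (round-to-nearest-even by default), and larger values are already integral, so they are passed through unchanged. The name rint_sse2 is hypothetical, and the sketch assumes the compiler does not reassociate the add/sub pair (for example under fast-math style options).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper, not from the original answer: branchless
   round-to-integer for doubles using only SSE2. */
static inline double rint_sse2(double x) {
    const __m128d sign_mask = _mm_set_sd(-0.0);               /* sign bit only */
    const __m128d two52     = _mm_set_sd(4503599627370496.0); /* 2^52 */
    double result;

    __m128d v     = _mm_load_sd(&x);
    __m128d sign  = _mm_and_pd(v, sign_mask);      /* sign bit of x */
    __m128d absx  = _mm_andnot_pd(sign_mask, v);   /* |x| */
    __m128d magic = _mm_or_pd(two52, sign);        /* 2^52 with the sign of x */

    /* (x + magic) - magic rounds x to an integer in the current rounding mode. */
    __m128d r = _mm_sub_sd(_mm_add_sd(v, magic), magic);

    /* Mask is all ones when |x| < 2^52; false for large values, inf and NaN. */
    __m128d is_small = _mm_cmplt_sd(absx, two52);

    /* Select the rounded value for small inputs, the input itself otherwise. */
    r = _mm_or_pd(_mm_and_pd(is_small, r), _mm_andnot_pd(is_small, v));

    /* Restore the sign so that, for example, -0.3 yields -0.0. */
    r = _mm_or_pd(r, sign);

    _mm_store_sd(&result, r);
    return result;
}
```

With the default rounding mode, halfway cases go to the nearest even integer, so rint_sse2(2.5) gives 2.0 and rint_sse2(3.5) gives 4.0, and values beyond the int range such as 3000000000.7 are handled as well.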

Source: https://stackoverflow.com/questions/21601806/implementing-rint-in-x86-64

Tags

visual-c++

math

floating-point

sse
