Question
MSVC 2012 doesn't have the rint() function. For 32-bit, I'm using the following:
double rint(double x) {
    __asm {
        fld x
        frndint
    }
}
This doesn't work on x64, where MSVC does not support inline assembly. There's _mm_round_sd(), but that requires SSE4.1. What is an efficient, preferably branchless, way of getting the same behavior?
Answer 1:
rint 64-bit mode
#include <emmintrin.h>
static inline double rint (double const x) {
    // Goes through a 32-bit int, so the result is only valid for |x| < 2^31.
    return (double)_mm_cvtsd_si32(_mm_load_sd(&x));
}
See Agner Fog's manual "Optimizing software in C++" for lrint:
32-bit mode
// Example 14.19
static inline int lrint (double const x) { // Round to nearest integer
    int n;
#if defined(__unix__) || defined(__GNUC__)
    // 32-bit Linux, Gnu/AT&T syntax:
    __asm ("fldl %1 \n fistpl %0 " : "=m"(n) : "m"(x) : "memory" );
#else
    // 32-bit Windows, Intel/MASM syntax:
    __asm fld qword ptr x;
    __asm fistp dword ptr n;
#endif
    return n;
}
64-bit mode
// Example 14.21. // Only for SSE2 or x64
#include <emmintrin.h>
static inline int lrint (double const x) {
    return _mm_cvtsd_si32(_mm_load_sd(&x));
}
Edit: I just realized that this method limits the values to ±2^31. A version with a larger range is complicated with SSE2 alone (but easy with SSE4.1). See the round function in the file vectorf128.h of Agner Fog's Vector Class Library for an example.
Source: https://stackoverflow.com/questions/21601806/implementing-rint-in-x86-64
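One commonly used SSE2-only way to cover the full double range is to add and then subtract 2^52 on the absolute value, then restore the sign. The following is a minimal sketch of that idea, not the code from vectorf128.h. The name rint_sse2 is hypothetical, and the sketch assumes the compiler does not reassociate floating-point arithmetic (no /fp:fast or -ffast-math). Like rint, it follows the current MXCSR rounding mode, which is round to nearest even by default.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: branchless rint using only SSE2. */
static inline double rint_sse2(double x) {
    const __m128d sign_mask = _mm_set_sd(-0.0);               /* only the sign bit set */
    const __m128d magic     = _mm_set_sd(4503599627370496.0); /* 2^52 */
    __m128d v    = _mm_set_sd(x);
    __m128d sign = _mm_and_pd(v, sign_mask);     /* sign bit of x */
    __m128d absv = _mm_andnot_pd(sign_mask, v);  /* |x| */

    /* In [2^52, 2^53) the spacing between doubles is 1, so the addition
       rounds |x| to an integer; the subtraction is then exact. */
    __m128d r = _mm_sub_sd(_mm_add_sd(absv, magic), magic);

    /* Values with |x| >= 2^52 are already integers, and NaN compares
       false, so both pass through unchanged. */
    __m128d small = _mm_cmplt_sd(absv, magic);
    r = _mm_or_pd(_mm_and_pd(small, r), _mm_andnot_pd(small, absv));

    /* Restore the original sign (this also keeps -0.0 for small negatives). */
    r = _mm_or_pd(r, sign);

    double result;
    _mm_store_sd(&result, r);
    return result;
}
```

The select is done with bitwise masks instead of a comparison and jump, so the function stays branchless, and values such as 3000000000.7 that overflow the 32-bit conversion are handled.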