Which instructions would be used for comparing two 128-bit vectors, each consisting of 4 * 32-bit floating point values?
Is there an instruction that considers a NaN value on both sides as equal?
Here is one possible solution. It's not very efficient, however, requiring 6 instructions (and SSE4.1 for the final test):
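#include <smmintrin.h> // SSE4.1, needed for _mm_testz_si128
__m128 v0, v1; // float vectors
__m128 v0nan = _mm_cmpeq_ps(v0, v0); // all ones in lanes where v0 is not NaN
__m128 v1nan = _mm_cmpeq_ps(v1, v1); // all ones in lanes where v1 is not NaN
__m128 vnan = _mm_or_ps(v0nan, v1nan); // zero only in lanes where both are NaN
__m128 vcmp = _mm_cmpneq_ps(v0, v1); // all ones in lanes that differ (or where either is NaN)
vcmp = _mm_and_ps(vcmp, vnan); // clear the lanes where both are NaN
bool cmp = _mm_testz_si128(_mm_castps_si128(vcmp), _mm_castps_si128(vcmp)); // true if all lanes are equal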
Note that all the logic above is inverted, which may make the code a little hard to follow (ORs are effectively ANDs, and vice versa).