Looking at the SSE operators
CMPORDPS - ordered compare packed singles
CMPUNORDPS - unordered compare packed singles
What do ordered and unordered mean? I looked for equivalent instructions in the x86 instruction set, and it only seems to have unordered (FUCOM).
An ordered comparison checks if neither operand is NaN. Conversely, an unordered comparison checks if either operand is a NaN.
This page gives some more information on this:
The idea here is that comparisons with NaN are indeterminate (the result can't be decided). So an ordered/unordered comparison checks if this is (or isn't) the case.
double a = 0.;
double b = 0.;
__m128d x = _mm_set1_pd(a / b); // NaN
__m128d y = _mm_set1_pd(1.0); // 1.0
__m128d z = _mm_set1_pd(1.0); // 1.0
__m128d c0 = _mm_cmpord_pd(x,y); // NaN vs. 1.0
__m128d c1 = _mm_cmpunord_pd(x,y); // NaN vs. 1.0
__m128d c2 = _mm_cmpord_pd(y,z); // 1.0 vs. 1.0
__m128d c3 = _mm_cmpunord_pd(y,z); // 1.0 vs. 1.0
__m128d c4 = _mm_cmpord_pd(x,x); // NaN vs. NaN
__m128d c5 = _mm_cmpunord_pd(x,x); // NaN vs. NaN
cout << _mm_castpd_si128(c0).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c1).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c2).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c3).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c4).m128i_i64[0] << endl;
cout << _mm_castpd_si128(c5).m128i_i64[0] << endl;
Result:
0
-1
-1
0
0
-1
- Ordered comparison of NaN and 1.0 gives false.
- Unordered comparison of NaN and 1.0 gives true.
- Ordered comparison of 1.0 and 1.0 gives true.
- Unordered comparison of 1.0 and 1.0 gives false.
- Ordered comparison of NaN and NaN gives false.
- Unordered comparison of NaN and NaN gives true.
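The `.m128i_i64` member used in the snippet above is specific to MSVC's definition of `__m128i`. As a sketch of a more portable way to read the same results, `_mm_movemask_pd` collects the sign bit of each lane into an int, so an all-ones lane shows up as a set bit. The helper names `ordered_mask`, `unordered_mask` and `self_check_masks` are made up for this illustration.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2: _mm_cmpord_pd, _mm_cmpunord_pd, _mm_movemask_pd
#include <limits>

// Hypothetical helpers (not from the original answer).
// Bit i of the result is set when lane i of the comparison mask is all-ones.
inline int ordered_mask(__m128d a, __m128d b) {
    return _mm_movemask_pd(_mm_cmpord_pd(a, b));
}

inline int unordered_mask(__m128d a, __m128d b) {
    return _mm_movemask_pd(_mm_cmpunord_pd(a, b));
}

// Repeats the six comparisons from the answer, plus one mixed vector.
inline void self_check_masks() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const __m128d x = _mm_set1_pd(nan);
    const __m128d y = _mm_set1_pd(1.0);
    assert(ordered_mask(x, y) == 0);    // NaN vs. 1.0: no lane is ordered
    assert(unordered_mask(x, y) == 3);  // both lanes unordered (bits 0 and 1)
    assert(ordered_mask(y, y) == 3);    // 1.0 vs. 1.0: both lanes ordered
    assert(unordered_mask(y, y) == 0);
    assert(ordered_mask(x, x) == 0);    // NaN vs. NaN is still unordered
    assert(unordered_mask(x, x) == 3);
    // Mixed case: only the low lane holds a NaN.
    const __m128d m = _mm_set_pd(1.0, nan);  // arguments are (high lane, low lane)
    assert(unordered_mask(m, y) == 1);
}
```

Calling `self_check_masks()` from `main` runs the checks; a mask of 0 corresponds to the printed 0 above and a set bit corresponds to the printed -1.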
This Intel guide: http://intel80386.com/simd/mmx2-doc.html contains examples of the two which are fairly straight-forward:
CMPORDPS Compare Ordered Parallel Scalars
Opcode        Cycles  Instruction
0F C2 .. 07   2 (3)   CMPORDPS xmm reg,xmm reg/mem128
CMPORDPS op1, op2
op1 contains 4 single precision 32-bit floating point values
op2 contains 4 single precision 32-bit floating point values
op1[0] = (op1[0] != NaN) && (op2[0] != NaN)
op1[1] = (op1[1] != NaN) && (op2[1] != NaN)
op1[2] = (op1[2] != NaN) && (op2[2] != NaN)
op1[3] = (op1[3] != NaN) && (op2[3] != NaN)
TRUE = 0xFFFFFFFF
FALSE = 0x00000000
CMPUNORDPS Compare Unordered Parallel Scalars
Opcode        Cycles  Instruction
0F C2 .. 03   2 (3)   CMPUNORDPS xmm reg,xmm reg/mem128
CMPUNORDPS op1, op2
op1 contains 4 single precision 32-bit floating point values
op2 contains 4 single precision 32-bit floating point values
op1[0] = (op1[0] == NaN) || (op2[0] == NaN)
op1[1] = (op1[1] == NaN) || (op2[1] == NaN)
op1[2] = (op1[2] == NaN) || (op2[2] == NaN)
op1[3] = (op1[3] == NaN) || (op2[3] == NaN)
TRUE = 0xFFFFFFFF
FALSE = 0x00000000
The difference is AND (ordered) vs OR (unordered).
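The guide's pseudocode can't be typed into C++ literally, because `x != NaN` is true for every `x` (including NaN itself) and `x == NaN` is always false. A minimal scalar model of the two instructions, assuming `std::isnan` stands in for the pseudocode's NaN tests, might look like this. The names `Lanes`, `Mask`, `model_cmpordps` and `model_cmpunordps` are invented for the sketch.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

using Lanes = std::array<float, 4>;
using Mask  = std::array<std::uint32_t, 4>;

// Scalar model of CMPORDPS: a lane is all-ones when NEITHER input is NaN (AND).
inline Mask model_cmpordps(const Lanes& a, const Lanes& b) {
    Mask r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = (!std::isnan(a[i]) && !std::isnan(b[i])) ? 0xFFFFFFFFu : 0u;
    return r;
}

// Scalar model of CMPUNORDPS: a lane is all-ones when EITHER input is NaN (OR).
inline Mask model_cmpunordps(const Lanes& a, const Lanes& b) {
    Mask r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = (std::isnan(a[i]) || std::isnan(b[i])) ? 0xFFFFFFFFu : 0u;
    return r;
}

inline void self_check_lane_model() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    const Lanes a{1.0f, nan, 3.0f, nan};
    const Lanes b{1.0f, 2.0f, nan, nan};
    assert((model_cmpordps(a, b) == Mask{0xFFFFFFFFu, 0u, 0u, 0u}));
    assert((model_cmpunordps(a, b) == Mask{0u, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu}));
}
```

In every lane the two masks are bitwise complements of each other, which is just De Morgan's law applied to the AND and OR forms.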
TL;DR: Unordered is a relation two FP values can have. The "Unordered" in FUCOM means it doesn't raise an FP exception when the comparison result is unordered, while FCOM does. This is the same as the distinction between OQ and OS cmpps predicates.
ORD and UNORD are two choices of predicate for the cmppd / cmpps / cmpss / cmpsd insns (full tables in the cmppd entry which is alphabetically first). That html extract has readable table formatting, but Intel's official PDF original is somewhat better. (See the x86 tag wiki for links).
Two floating point operands are ordered with respect to each other if neither is NaN. They're unordered if either is NaN, i.e. ordered = (x>y) | (x==y) | (x<y). That's right, with floating point it's possible for none of those things to be true. For more Floating Point madness, see Bruce Dawson's excellent series of articles.
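The identity can be checked directly in scalar C++. This is a small sketch, with `ordered_by_relations` and `self_check_relations` as made-up names; `std::isunordered` from `<cmath>` is the standard-library spelling of the opposite test.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Two values are ordered exactly when one of the three relations holds.
inline bool ordered_by_relations(double x, double y) {
    return (x > y) || (x == y) || (x < y);
}

inline void self_check_relations() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double inf = std::numeric_limits<double>::infinity();
    assert(ordered_by_relations(1.0, 2.0));
    assert(ordered_by_relations(inf, inf));   // infinities compare like ordinary values
    assert(!ordered_by_relations(nan, 1.0));  // none of >, ==, < is true
    assert(!ordered_by_relations(nan, nan));  // NaN is not even equal to itself
    // The standard library agrees with the hand-written test.
    assert(std::isunordered(nan, 1.0) == !ordered_by_relations(nan, 1.0));
    assert(std::isunordered(1.0, 2.0) == !ordered_by_relations(1.0, 2.0));
}
```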
cmpps takes a predicate and produces a vector of results, instead of doing a comparison between two scalars and setting flags so you can check any predicate you want after the fact. So it needs specific predicates for everything you can check.
The scalar equivalent is comiss / ucomiss to set ZF/PF/CF from the FP comparison result (which works like the x87 compare instructions (see the last section of this answer), but on the low element of XMM regs).
To check for unordered, look at PF. If the comparison is ordered, you can look at the other flags to see whether the operands were greater, equal, or less (using the same conditions as for unsigned integers, like jae for Above or Equal).
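To make the flag logic concrete, here is a scalar model of the ZF/PF/CF result of comiss / ucomiss (unordered sets all three; greater clears all three; less sets CF; equal sets ZF). It is only a model, not the instruction itself: the struct `Flags` and the functions named after the `jcc` mnemonics are hypothetical, and the exception behaviour is not represented.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

struct Flags { bool zf, pf, cf; };

// Models the flags written by comiss/ucomiss for "compare a with b".
inline Flags model_ucomiss(float a, float b) {
    if (std::isnan(a) || std::isnan(b)) return {true, true, true};  // unordered
    if (a > b) return {false, false, false};                        // greater
    if (a < b) return {false, false, true};                         // less
    return {true, false, false};                                    // equal
}

// The conditions as the corresponding jcc/setcc instructions evaluate them.
inline bool jp(Flags f)  { return f.pf; }            // unordered
inline bool ja(Flags f)  { return !f.cf && !f.zf; }  // a > b; false on unordered
inline bool jae(Flags f) { return !f.cf; }           // a >= b; false on unordered
inline bool jb(Flags f)  { return f.cf; }            // a < b, but ALSO true on unordered
inline bool je(Flags f)  { return f.zf; }            // a == b, but ALSO true on unordered

inline void self_check_flags() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(ja(model_ucomiss(2.0f, 1.0f)) && !jp(model_ucomiss(2.0f, 1.0f)));
    assert(jb(model_ucomiss(1.0f, 2.0f)) && !jae(model_ucomiss(1.0f, 2.0f)));
    assert(je(model_ucomiss(1.0f, 1.0f)) && jae(model_ucomiss(1.0f, 1.0f)));
    const Flags u = model_ucomiss(nan, 1.0f);
    assert(jp(u));             // PF flags the unordered case
    assert(!ja(u) && !jae(u)); // "above" conditions are false when unordered
    assert(jb(u) && je(u));    // "below" and "equal" fire too, hence the PF check first
}
```

The last assertion is the reason to test PF before trusting `jb` or `je`: with a NaN operand both are taken even though the values are neither less nor equal.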
The COMISS instruction differs from the UCOMISS instruction in that it signals a SIMD floating-point invalid operation exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISS instruction signals an invalid numeric exception only if a source operand is an SNaN.
Normally FP exceptions are masked, so this doesn't actually interrupt your program; it just sets the bit in the MXCSR which you can check later.
This is the same as O/UQ vs. O/US flavours of predicate for cmpps / vcmpps. The AVX versions of the cmp[ps][sd] instructions have an expanded choice of predicate, so they needed a naming convention to keep track of them.
The O vs. U tells you whether the predicate is true when the operands are unordered.
The Q vs. S tells you whether #I will be raised if either operand is a Quiet NaN. #I will always be raised if either operand is a Signalling NaN, but those are not "naturally occurring". You don't get them as outputs from other operations, only by creating the bit pattern yourself (e.g. as an error-return value from a function, to ensure detection of problems later).
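As a rough illustration of the O/U half of the naming scheme, the following scalar model evaluates a handful of predicates. The enumerator names mirror the `_CMP_*` constants in `<immintrin.h>` (for example `_CMP_EQ_OQ`, `_CMP_NLT_US`), but `Pred` and `eval_pred` are invented for this sketch, and the Q/S half (whether #I is raised for a quiet NaN) cannot be seen in a boolean return value, so it is not modelled.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

enum class Pred { EQ_OQ, EQ_UQ, NEQ_OQ, NEQ_UQ, LT_OS, NLT_US };

// Result of one lane of the comparison. The O/U letter decides what
// happens when the operands are unordered: O gives false, U gives true.
inline bool eval_pred(Pred p, float a, float b) {
    const bool unordered = std::isnan(a) || std::isnan(b);
    switch (p) {
        case Pred::EQ_OQ:  return !unordered && a == b;
        case Pred::EQ_UQ:  return  unordered || a == b;
        case Pred::NEQ_OQ: return !unordered && a != b;
        case Pred::NEQ_UQ: return  unordered || a != b;
        case Pred::LT_OS:  return !unordered && a < b;
        case Pred::NLT_US: return  unordered || !(a < b);
    }
    return false;  // not reached for valid Pred values
}

inline void self_check_predicates() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    // Ordered operands: O and U flavours agree.
    assert(eval_pred(Pred::EQ_OQ, 1.0f, 1.0f) && eval_pred(Pred::EQ_UQ, 1.0f, 1.0f));
    assert(eval_pred(Pred::LT_OS, 1.0f, 2.0f) && !eval_pred(Pred::NLT_US, 1.0f, 2.0f));
    // Unordered operands: every O predicate is false, every U predicate is true.
    assert(!eval_pred(Pred::EQ_OQ, nan, 1.0f) && eval_pred(Pred::EQ_UQ, nan, 1.0f));
    assert(!eval_pred(Pred::NEQ_OQ, nan, 1.0f) && eval_pred(Pred::NEQ_UQ, nan, 1.0f));
    assert(!eval_pred(Pred::LT_OS, nan, 1.0f) && eval_pred(Pred::NLT_US, nan, 1.0f));
}
```

Note that each U predicate here is the logical negation of an O predicate (NLT_US is "not LT_OS", NEQ_UQ is "not EQ_OQ"), which is why a negated comparison comes out true on NaN.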
The x87 equivalent is using fcom or fucom to set the FPU status word -> fstsw ax -> sahf, or preferably fucomi to set EFLAGS directly like comiss.
The U / non-U distinction is the same with x87 instructions as for comiss / ucomiss.
Perhaps this page on Visual C++ intrinsics can be of help? :)
CMPORDPS
r0 := (a0 ord? b0) ? 0xffffffff : 0x0
r1 := (a1 ord? b1) ? 0xffffffff : 0x0
r2 := (a2 ord? b2) ? 0xffffffff : 0x0
r3 := (a3 ord? b3) ? 0xffffffff : 0x0
CMPUNORDPS
r0 := (a0 unord? b0) ? 0xffffffff : 0x0
r1 := (a1 unord? b1) ? 0xffffffff : 0x0
r2 := (a2 unord? b2) ? 0xffffffff : 0x0
r3 := (a3 unord? b3) ? 0xffffffff : 0x0
Source: https://stackoverflow.com/questions/8627331/what-does-ordered-unordered-comparison-mean