Compare the sign bit in SSE Intrinsics

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-13 16:52:43

Question


How would one create a mask using SSE intrinsics that indicates whether the signs of two packed floats (__m128s) are the same? For example, when comparing a and b, where a is [1.0 -1.0 0.0 2.0] and b is [1.0 1.0 1.0 1.0], the desired mask is [true false true true].


Answer 1:


Here's one solution:

const __m128i MASK = _mm_set1_epi32(0xffffffff);

__m128 a = _mm_setr_ps(1,-1,0,2);
__m128 b = _mm_setr_ps(1,1,1,1);

__m128  f = _mm_xor_ps(a,b);
__m128i i = _mm_castps_si128(f);

i = _mm_srai_epi32(i,31);
i = _mm_xor_si128(i,MASK);

f = _mm_castsi128_ps(i);

//  i = (0xffffffff, 0, 0xffffffff, 0xffffffff)
//  f = (0xffffffff, 0, 0xffffffff, 0xffffffff)

In this snippet, both i and f hold the same bitmask. I assume you want the result as an __m128, so I added f = _mm_castsi128_ps(i); to convert it back from an __m128i.

Note that this code is sensitive to the sign of zero: 0.0 and -0.0 have different sign bits, so a lane holding -0.0 is treated as negative even though the two values compare equal.
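To make the signed-zero caveat concrete, here is a minimal sketch in C. The helper name sign_bits_differ is made up for illustration; it looks only at lane 0 and uses the same XOR idea as the snippet above.

```c
#include <xmmintrin.h>  /* SSE: _mm_set_ss, _mm_xor_ps, _mm_movemask_ps */

/* Hypothetical helper: returns 1 if the sign bits of x and y differ,
   0 if they match. Only lane 0 is used; the upper lanes are zero. */
static int sign_bits_differ(float x, float y)
{
    __m128 f = _mm_xor_ps(_mm_set_ss(x), _mm_set_ss(y));
    return _mm_movemask_ps(f) & 1;  /* bit 0 = sign bit of lane 0 */
}
```

With this helper, sign_bits_differ(0.0f, -0.0f) is 1 even though 0.0f == -0.0f is true, so the mask from the snippet above would report "different signs" for that lane.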


Explanations:

The way the code works is as follows:

f = _mm_xor_ps(a,b);       //  xor the sign bits (well all the bits actually)

i = _mm_castps_si128(f);   //  Reinterpret the bits as integers. The cast generates no instruction.

i = _mm_srai_epi32(i,31);  //  Arithmetic shift that sign bit into all the bits.

i = _mm_xor_si128(i,MASK); //  Invert all the bits

f = _mm_castsi128_ps(i);   //  Reinterpret back as floats. Again, no instruction is generated.
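The steps above can be wrapped into a small function. This is a sketch, assuming SSE2 is available; the name same_sign_mask is not from the original answer, and the all-ones constant is written as -1 here instead of 0xffffffff.

```c
#include <emmintrin.h>  /* SSE2: integer shift and the ps <-> si128 casts */

/* Hypothetical helper: each 32-bit lane of the result is all ones if the
   sign bits of a and b match in that lane, and all zeros otherwise. */
static __m128 same_sign_mask(__m128 a, __m128 b)
{
    __m128i i = _mm_castps_si128(_mm_xor_ps(a, b)); /* sign bit set where signs differ */
    i = _mm_srai_epi32(i, 31);                      /* copy the sign bit into every bit */
    i = _mm_xor_si128(i, _mm_set1_epi32(-1));       /* invert: ones where signs match */
    return _mm_castsi128_ps(i);
}
```

For a = _mm_setr_ps(1, -1, 0, 2) and b = _mm_setr_ps(1, 1, 1, 1), _mm_movemask_ps(same_sign_mask(a, b)) gives 0xD, meaning lanes 0, 2 and 3 are set. The returned mask can be passed straight to _mm_and_ps or _mm_andnot_ps to select lanes.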



Answer 2:


Have a look at the _mm_movemask_ps intrinsic, which extracts the most significant bit (i.e. the sign bit) from each of 4 floats. See http://msdn.microsoft.com/en-us/library/4490ys29.aspx

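For example, if you have [1.0 -1.0 0.0 2.0], listed from the highest element to the lowest (the argument order of _mm_set_ps), then _mm_movemask_ps will return 4, or 0100 in binary. (With the elements loaded in _mm_setr_ps order it would return 2, or 0010.) If you get the movemask for each vector and combine the results with a bitwise NOT of the XOR, keeping only the low 4 bits, each bit of the result indicates whether the signs of that element are the same.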

a = [1.0 -1.0 0.0 2.0]
b = [1.0 1.0 1.0 1.0]
movemask_ps a = 4
movemask_ps b = 0
NOT (4 XOR 0), low 4 bits = 0xB, or binary 1011

Hence the signs are the same in every element except the second one.
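The movemask approach can be sketched in C as follows. The function name same_sign_bits is an assumption made for illustration; the result is an ordinary integer with one bit per element, not a vector mask.

```c
#include <xmmintrin.h>  /* SSE: _mm_movemask_ps */

/* Hypothetical helper: bit k of the result is 1 if element k of a and
   element k of b have the same sign bit. Only the low 4 bits are used. */
static int same_sign_bits(__m128 a, __m128 b)
{
    int sa = _mm_movemask_ps(a);  /* one sign bit per element */
    int sb = _mm_movemask_ps(b);
    return ~(sa ^ sb) & 0xF;      /* NOT XOR, masked to 4 bits */
}
```

With a = _mm_set_ps(1, -1, 0, 2) and b = _mm_set_ps(1, 1, 1, 1) this returns 0xB, matching the worked example above. A result of 0xF means the signs agree in all four elements.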



Source: https://stackoverflow.com/questions/8440764/compare-the-sign-bit-in-sse-intrinsics
