Convert _mm_shuffle_epi32 to a C expression for the permutation?

Posted by China☆狼群 on 2019-11-27 16:15:48

There's no AND/OR going on, unless you need to unpack the 8-bit integer holding four 2-bit indices.
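If you do need to start from the packed immediate, the unpacking is just a shift and a mask per element. A minimal sketch, assuming the standard _MM_SHUFFLE layout (d<<6 | c<<4 | b<<2 | a); pshufd_imm8 is a made-up helper name, not a real intrinsic:

```c
#include <stdint.h>

/* Scalar model of dst = _mm_shuffle_epi32(src, imm8), taking the packed
   8-bit immediate directly. pshufd_imm8 is a hypothetical name. */
void pshufd_imm8(int dst[4], const int src[4], uint8_t imm8)
{
    for (int i = 0; i < 4; i++) {
        /* Bits [2i+1 : 2i] of imm8 hold the source index for dst[i]. */
        dst[i] = src[(imm8 >> (2 * i)) & 3];
    }
}
```

For example, _MM_SHUFFLE(2, 3, 0, 1) packs to 0xB1, so pshufd_imm8(dst, src, 0xB1) swaps each adjacent pair of elements. dst and src must not be the same array here, since elements are written one at a time.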

Make your own definition for _MM_SHUFFLE that expands to four args, instead of packing them.

It's something like

// dst = _mm_shuffle_epi32(src, _MM_SHUFFLE(d,c,b,a))
void pshufd(int dst[4], const int src[4], int d, int c, int b, int a)
{   // note that the _MM_SHUFFLE args are high-element-first order
    dst[0] = src[a];
    dst[1] = src[b];
    dst[2] = src[c];
    dst[3] = src[d];
}
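As a rough sketch of the "make your own _MM_SHUFFLE" idea: a macro that expands to four separate arguments can be dropped straight into the call. SHUFFLE4 and swap_pairs are names invented for this example, and pshufd is repeated so the snippet compiles on its own:

```c
/* Hypothetical replacement for _MM_SHUFFLE: expands to four separate
   arguments instead of packing them into one byte. */
#define SHUFFLE4(d, c, b, a) (d), (c), (b), (a)

/* Scalar model of pshufd; args are in high-element-first order. */
static void pshufd(int dst[4], const int src[4], int d, int c, int b, int a)
{
    dst[0] = src[a];
    dst[1] = src[b];
    dst[2] = src[c];
    dst[3] = src[d];
}

/* Swap each adjacent pair: {x0,x1,x2,x3} -> {x1,x0,x3,x2}. */
void swap_pairs(int dst[4], const int src[4])
{
    pshufd(dst, src, SHUFFLE4(2, 3, 0, 1));
}
```

This only works because pshufd is a real function; passing SHUFFLE4(...) to another function-like macro would not split into four arguments the same way.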

Vectors are indexed from low element = 0. The low element is the one that stores into memory at the lowest address, but when values are in registers you should think about them as [ 3 2 1 0 ]. In this notation, vector right-shifts (like psrldq) actually shift to the right.

This is why _mm_set_epi32(3, 2, 1, 0) takes its args in reverse order from int foo[] = { 0, 1, 2, 3 };.
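Both points are easy to check by storing a vector to memory. A small sketch, assuming an x86 target with SSE2 (emmintrin.h); set_order_demo and shift_right_demo are just names picked for this example:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* _mm_set_epi32 takes the high element first, so this stores {0, 1, 2, 3}. */
void set_order_demo(int out[4])
{
    __m128i v = _mm_set_epi32(3, 2, 1, 0);   /* register view: [ 3 2 1 0 ] */
    _mm_storeu_si128((__m128i *)out, v);     /* unaligned store is always safe */
}

/* psrldq by 4 bytes: [ 3 2 1 0 ] becomes [ 0 3 2 1 ], with a zero shifted
   in at the high end. In memory order that is {1, 2, 3, 0}. */
void shift_right_demo(int out[4])
{
    __m128i v = _mm_set_epi32(3, 2, 1, 0);
    _mm_storeu_si128((__m128i *)out, _mm_srli_si128(v, 4));
}
```

Written as [ 3 2 1 0 ], the byte shift moves everything one slot to the right, which is the point of thinking about registers in that notation.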

When it's not clear exactly what an intrinsic does, a few sample runs with simple inputs can help as well:

// needs <stdio.h> and <emmintrin.h> (SSE2)
int x[] = {0,1,2,3}, y[4];
__m128i s = _mm_shuffle_epi32(_mm_loadu_si128((__m128i*)x), _MM_SHUFFLE(2, 3, 0, 1));
// y is not guaranteed to be 16-byte aligned, so use the unaligned store
_mm_storeu_si128((__m128i*)y, s);
printf("{%d,%d,%d,%d} => {%d,%d,%d,%d}\n", x[0], x[1], x[2], x[3], y[0], y[1], y[2], y[3]);

This prints:

{0,1,2,3} => {1,0,3,2}
