indexing into an array with SSE

Submitted by 主宰稳场 on 2019-11-30 07:10:26

Question


Suppose I have an array:

uint8_t arr[256];

and an element

__m128i x

containing 16 bytes,

x_1, x_2, ... x_16

I would like to efficiently fill a new __m128i element

__m128i y

with values from arr depending on the values in x, such that:

y_1  = arr[x_1]
y_2  = arr[x_2]
   .
   .
   .
y_16 = arr[x_16]

An instruction to achieve this would essentially load a register from a non-contiguous set of memory locations. I have a painfully vague memory of having seen documentation for such an instruction, but can't find it now. Does it exist? Thanks in advance for your help.


Answer 1:


This capability is known in SIMD architectures as gather (for loads) and scatter (for stores). Unfortunately, SSE does not have it. Later Intel SIMD extensions added it (the ill-fated Larrabee processor was an early case in point, and AVX2 eventually introduced gather instructions), but only for 32- and 64-bit elements, so there is still no direct byte-wise gather. With SSE alone, you will need to design your data structures so that this kind of functionality is not needed.

Note that you can achieve the equivalent effect by using e.g. _mm_set_epi8:

y = _mm_set_epi8(arr[x_16], arr[x_15], arr[x_14], ..., arr[x_1]);

although of course this will just generate a bunch of scalar code to load your y vector. This is fine if you are doing this kind of operation outside any performance-critical loops, e.g. as part of initialisation prior to looping, but inside a loop it is likely to be a performance-killer.
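For reference, the scalar fallback can also be written as an explicit loop rather than sixteen hand-written arguments. The following is only a minimal sketch: the helper name lookup16 is hypothetical (it is not part of any library), and it assumes SSE2 is available and that arr points to a full 256-entry table, so every byte value in x is a valid index. It does the same work as the _mm_set_epi8 version, so the same performance caveat applies.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns y with y_i = arr[x_i] for each of the
   16 bytes of x. arr is assumed to point to a 256-entry table. */
static __m128i lookup16(const uint8_t *arr, __m128i x)
{
    uint8_t idx[16];
    uint8_t out[16];

    assert(arr != NULL);

    /* Spill the 16 index bytes to memory so they can be read as scalars. */
    _mm_storeu_si128((__m128i *)idx, x);

    /* One scalar table load per byte lane. */
    for (int i = 0; i < 16; ++i) {
        out[i] = arr[idx[i]];
    }

    /* Reload the gathered bytes as a single vector. */
    return _mm_loadu_si128((const __m128i *)out);
}
```

With the declarations from the question, a call would look like y = lookup16(arr, x). Going through small stack buffers keeps the code portable across compilers, but it is still sixteen scalar loads per vector.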



Source: https://stackoverflow.com/questions/4483828/indexing-into-an-array-with-sse
