NEON, SSE and interleaving loads vs shuffles

Submitted by 吃可爱长大的小学妹 on 2019-12-01 01:48:26

According to this page:

The VLD3 intrinsic you need is:

int8x8x3_t  vld3_s8(__transfersize(24) int8_t const * ptr);
// VLD3.8 {d0, d1, d2}, [r0]

If the memory at the address pointed to by ptr contains this data (shown as little-endian 32-bit words):

0x00: 33221100
0x04: 77665544
0x08: bbaa9988
0x0c: ffddccbb
0x10: 76543210
0x14: fedcba98

After the load, the registers will contain (lane 0 in the least significant byte):

d0: ba54ffbb99663300
d1: dc7610ccaa774411
d2: fe9832ddbb885522
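The de-interleaving above can be checked without ARM hardware. The following is a minimal scalar sketch of what VLD3.8 does to the 24 bytes, not the actual intrinsic; the names deinterleave3_u8, pack_lanes, example_bytes and vld3_example_ok are hypothetical and exist only for this illustration. It assumes the dump is read as little-endian words and that each d register is printed with lane 0 in its lowest byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of VLD3.8 {d0, d1, d2}, [r0] (an illustration, not the
   intrinsic): byte 3*lane + k of the source goes to lane `lane` of
   output register k. */
static void deinterleave3_u8(const uint8_t *src, uint8_t d[3][8])
{
    assert(src != NULL);
    for (size_t lane = 0; lane < 8; ++lane)
        for (size_t k = 0; k < 3; ++k)
            d[k][lane] = src[3 * lane + k];
}

/* Packs 8 lanes into one 64-bit value with lane 0 in the least
   significant byte, which is how the d registers are printed above. */
static uint64_t pack_lanes(const uint8_t lanes[8])
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; ++i)
        v |= (uint64_t)lanes[i] << (8 * i);
    return v;
}

/* The memory dump from the text, written byte by byte in address order
   (each 32-bit word above is little-endian). */
static const uint8_t example_bytes[24] = {
    0x00, 0x11, 0x22, 0x33,  0x44, 0x55, 0x66, 0x77,
    0x88, 0x99, 0xaa, 0xbb,  0xbb, 0xcc, 0xdd, 0xff,
    0x10, 0x32, 0x54, 0x76,  0x98, 0xba, 0xdc, 0xfe
};

/* Returns 1 if the model reproduces the d0..d2 values listed above. */
static int vld3_example_ok(void)
{
    uint8_t d[3][8];
    deinterleave3_u8(example_bytes, d);
    return pack_lanes(d[0]) == 0xba54ffbb99663300ULL
        && pack_lanes(d[1]) == 0xdc7610ccaa774411ULL
        && pack_lanes(d[2]) == 0xfe9832ddbb885522ULL;
}
```

Calling vld3_example_ok() from a main function is enough to confirm that the byte-to-lane mapping matches the register contents shown.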

The int8x8x3_t structure is defined as:

struct int8x8x3_t
{
   int8x8_t val[3];
};
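As a rough sketch of how the three val members might be used, the helper below splits 24 interleaved bytes into three separate 8-byte arrays. The name split3_s8 and the HAVE_NEON macro are hypothetical. On targets without NEON the code falls back to a scalar loop that is assumed to behave the same way, so the example still compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: reads 24 interleaved bytes from src and writes
   the three de-interleaved streams to ch0, ch1 and ch2 (8 bytes each). */
static void split3_s8(const int8_t *src, int8_t *ch0, int8_t *ch1, int8_t *ch2)
{
    assert(src != NULL && ch0 != NULL && ch1 != NULL && ch2 != NULL);
#if HAVE_NEON
    /* One structure load fills all three vectors of the struct. */
    int8x8x3_t v = vld3_s8(src);
    vst1_s8(ch0, v.val[0]);   /* bytes 0, 3, 6, ... */
    vst1_s8(ch1, v.val[1]);   /* bytes 1, 4, 7, ... */
    vst1_s8(ch2, v.val[2]);   /* bytes 2, 5, 8, ... */
#else
    /* Scalar fallback for non-NEON targets (illustration only). */
    for (size_t i = 0; i < 8; ++i) {
        ch0[i] = src[3 * i + 0];
        ch1[i] = src[3 * i + 1];
        ch2[i] = src[3 * i + 2];
    }
#endif
}
```

A typical use would be separating packed three-channel data, such as RGB pixels, into one array per channel before further processing.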