SIMD (SSE) instruction for division in GCC

Submitted by 雨燕双飞 on 2019-12-04 07:45:26

I've used the SIMD extensions under Windows, but not yet under Linux. That said, you should be able to take advantage of the SSE instruction DIVPS, which divides one vector of 4 floats by another. Since you are using doubles, though, you'll want the SSE2 version, DIVPD, which operates on vectors of 2 doubles. Also, make sure to build with the -msse2 switch.
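To make the DIVPS/DIVPD distinction concrete, here is a minimal sketch using the corresponding intrinsics, _mm_div_ps and _mm_div_pd. The helper names div4_ps and div2_pd are hypothetical and only serve the illustration; unaligned loads and stores are used so the arrays need no special alignment.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE:  __m128,  _mm_div_ps (DIVPS) */
#include <emmintrin.h>  /* SSE2: __m128d, _mm_div_pd (DIVPD) */

/* Hypothetical helper: out[i] = a[i] / b[i] for 4 floats (one DIVPS). */
void div4_ps(float out[4], const float a[4], const float b[4])
{
    assert(out && a && b);
    /* Unaligned load/store variants, so any float array works. */
    _mm_storeu_ps(out, _mm_div_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* Hypothetical helper: out[i] = a[i] / b[i] for 2 doubles (one DIVPD). */
void div2_pd(double out[2], const double a[2], const double b[2])
{
    assert(out && a && b);
    _mm_storeu_pd(out, _mm_div_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
}
```

With GCC this should build with something like gcc -msse2 file.c; on x86-64 targets SSE2 is already enabled by default.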

I found a page which details some SSE GCC builtins. It looks kind of old, but should be a good start.

http://ds9a.nl/gcc-simd/

Is tmp.x *= 0.25; enough?
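If the divisor really is the constant 4 (an assumption here, since the question's code is not shown), the same idea carries over to SSE2 by multiplying with 0.25 instead of dividing. A rough sketch, where scale_quarter is a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2: _mm_mul_pd, _mm_set1_pd */

/* Hypothetical helper: scale n doubles by 0.25, i.e. divide each by 4. */
void scale_quarter(double *data, size_t n)
{
    const __m128d q = _mm_set1_pd(0.25);
    size_t i = 0;

    assert(data != NULL || n == 0);

    /* Two doubles per iteration, using unaligned load/store. */
    for (; i + 2 <= n; i += 2) {
        _mm_storeu_pd(data + i, _mm_mul_pd(_mm_loadu_pd(data + i), q));
    }
    /* Scalar tail for an odd element count. */
    for (; i < n; ++i) {
        data[i] *= 0.25;
    }
}
```

Because 0.25 is an exact power of two, this gives the same results as dividing by 4. For a divisor such as 3, multiplying by the reciprocal can differ from a true division in the last bit.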

Note that for SSE instructions (in case you want to use them) it's important that:

1) all memory accesses are 16-byte aligned

2) the operations are performed in a loop

3) no int <-> float or float <-> double conversions are performed

4) divisions are avoided if possible
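The four points above can be sketched together as follows. This is only an illustration under stated assumptions: div_aligned_pd is a hypothetical helper, the caller is assumed to pass a 16-byte-aligned buffer with an even element count, and the per-element division is replaced by one reciprocal computed up front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: _mm_load_pd, _mm_store_pd, _mm_mul_pd */

/* Hypothetical helper: divide n doubles by d in place.
   Assumes data is 16-byte aligned and n is even. */
void div_aligned_pd(double *data, size_t n, double d)
{
    /* Point 4: one division up front, multiplications in the loop
       (may differ from a true division in the last bit when 1/d
       is not exactly representable). */
    const __m128d inv = _mm_set1_pd(1.0 / d);
    size_t i;

    /* Point 1: the aligned load/store below require this. */
    assert(((uintptr_t)data & 15u) == 0);
    assert(n % 2 == 0);

    /* Points 2 and 3: a plain loop over doubles, no type conversions. */
    for (i = 0; i < n; i += 2) {
        _mm_store_pd(data + i, _mm_mul_pd(_mm_load_pd(data + i), inv));
    }
}
```

A buffer declared as double buf[4] __attribute__((aligned(16))) satisfies the alignment assumption with GCC.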

The intrinsic you are looking for is _mm_div_pd. Here is a working example which should be enough to steer you in the right direction:

#include <stdio.h>

#include <emmintrin.h>

typedef struct
{
    double x;
    double y;
    double z;
} v3d;

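/* Overlay the three doubles on two 16-byte SSE vectors.
   The fourth lane is padding and its value is ignored. */
typedef union __attribute__ ((aligned(16)))
{
    v3d a;
    __m128d v[2];
} u3d;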

int main(void)
{
    const __m128d vd = _mm_set1_pd(4.0);
    u3d u = { { 1.0, 2.0, 3.0 } };

    printf("v (before) = { %g %g %g }\n", u.a.x, u.a.y, u.a.z);

    /* v[0] holds { x, y }, v[1] holds { z, padding }. */
    u.v[0] = _mm_div_pd(u.v[0], vd);
    u.v[1] = _mm_div_pd(u.v[1], vd);

    printf("v (after) = { %g %g %g }\n", u.a.x, u.a.y, u.a.z);

    return 0;
}