Using __m256d registers

耶瑟儿～ 2021-02-08 01:16

How do you use __m256d?

Say I want to use the Intel AVX instruction _mm256_add_pd on a simple Vector3 class with 3-64 bit double p

1 answer

轮回少年 (original poster)

2021-02-08 01:56
First I'd like to clear up a little confusion. __m256d isn't a type of register; it's a data type whose values can be loaded into an AVX register. A __m256d is no more a register than an int is a register. There are a few ways to get data in and out of a __m256d (or any other vector type):

Using a union: Yes, the union trick works. It works very well, since the union will generally have the correct alignment for stack and static objects. Heap memory from malloc might not be 32-byte aligned, so use posix_memalign or _aligned_malloc there.
```
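#include <immintrin.h>

class Vector3 {
public:
    // _mm256_setr_pd fills lanes in memory order, so xx lands in lane 0,
    // which overlaps x in the union below. Lane 3 is padding.
    Vector3(double xx, double yy, double zz) : vec(_mm256_setr_pd(xx, yy, zz, 0.0)) {}
    Vector3(__m256d vvec) : vec(vvec) {}

    Vector3 operator+(const Vector3 &other) const
    {
        return Vector3(_mm256_add_pd(vec, other.vec));
    }

    // Anonymous structs and reading the inactive union member are compiler
    // extensions (accepted by GCC, Clang and MSVC), not standard C++.
    union {
        struct {
            double x, y, z;
        };
        __m256d vec; // a data field, maybe a register, maybe not
    };
};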
```
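The alignment caveat for heap memory can be sketched as follows. This is a minimal illustration, not part of the original answer: the helper names `alloc_aligned_doubles`, `free_aligned_doubles` and `is_aligned_32` are made up for the example, and it assumes a POSIX or Windows toolchain.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free
#if defined(_WIN32)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// Hypothetical helper: allocate n doubles on a 32-byte boundary.
// Returns nullptr on failure.
double *alloc_aligned_doubles(std::size_t n)
{
#if defined(_WIN32)
    return static_cast<double *>(_aligned_malloc(n * sizeof(double), 32));
#else
    void *p = nullptr;
    if (posix_memalign(&p, 32, n * sizeof(double)) != 0)
        return nullptr;
    return static_cast<double *>(p);
#endif
}

// Hypothetical helper: release memory obtained from alloc_aligned_doubles.
void free_aligned_doubles(double *p)
{
#if defined(_WIN32)
    _aligned_free(p);
#else
    free(p);
#endif
}

// True if p sits on a 32-byte boundary.
bool is_aligned_32(const void *p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 32 == 0;
}
```

Note the pairing: memory from posix_memalign is released with free, while memory from _aligned_malloc must be released with _aligned_free.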
Using intrinsics: Inside a function, it's usually easier to use intrinsics to get data in and out of a vector type.
```
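__m256d vec = ...;
double x, y, z;
// _mm256_setr_pd takes lanes in memory order (x in lane 0), matching the
// union layout; _mm256_set_pd takes them in reverse, highest lane first.
vec = _mm256_add_pd(vec, _mm256_setr_pd(x, y, z, 0.0));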
```
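A fuller round trip, packing scalars into a vector, adding, and unpacking the result, might look like the sketch below. The function name `add3` and the `AVX_FN` macro are assumptions made for this example. The macro enables AVX for a single function on GCC and Clang; compiling the whole file with AVX enabled (for instance `-mavx`) should work as well.

```cpp
#include <immintrin.h>

// Assumption for this sketch: enable AVX per function on GCC/Clang.
#if defined(__GNUC__)
#define AVX_FN __attribute__((target("avx")))
#else
#define AVX_FN
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i = 0, 1, 2.
AVX_FN void add3(const double a[3], const double b[3], double out[3])
{
    // setr places its first argument in lane 0; lane 3 is padding.
    __m256d va = _mm256_setr_pd(a[0], a[1], a[2], 0.0);
    __m256d vb = _mm256_setr_pd(b[0], b[1], b[2], 0.0);
    __m256d sum = _mm256_add_pd(va, vb);

    // storeu has no alignment requirement, but it writes all four lanes,
    // so store into a 4-element buffer and copy out the three we need.
    double tmp[4];
    _mm256_storeu_pd(tmp, sum);
    out[0] = tmp[0];
    out[1] = tmp[1];
    out[2] = tmp[2];
}
```

Storing through a temporary avoids writing a fourth double past the end of a 3-element output.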
Using pointer casts: Casting pointers is the last resort, for three reasons.
1. The pointer might not be aligned correctly.
2. Casting pointers can sometimes mess with the compiler's aliasing analysis.
3. Pointer casting bypasses a number of safety guarantees.
So I'd only use pointer casting to plow through a big array of data.
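The big-array case could be sketched like this, assuming the caller guarantees 32-byte alignment for all three buffers. The name `add_arrays` and the `AVX_FN` macro are hypothetical and exist only for this example. The asserts make the alignment precondition from reason 1 explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <immintrin.h>

// Assumption for this sketch: enable AVX per function on GCC/Clang.
#if defined(__GNUC__)
#define AVX_FN __attribute__((target("avx")))
#else
#define AVX_FN
#endif

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
// Precondition: a, b and out are all 32-byte aligned.
AVX_FN void add_arrays(const double *a, const double *b, double *out, std::size_t n)
{
    assert(reinterpret_cast<std::uintptr_t>(a) % 32 == 0);
    assert(reinterpret_cast<std::uintptr_t>(b) % 32 == 0);
    assert(reinterpret_cast<std::uintptr_t>(out) % 32 == 0);

    // The casts: view each buffer as a sequence of 4-double vectors.
    const __m256d *va = reinterpret_cast<const __m256d *>(a);
    const __m256d *vb = reinterpret_cast<const __m256d *>(b);
    __m256d *vo = reinterpret_cast<__m256d *>(out);

    const std::size_t blocks = n / 4;
    for (std::size_t i = 0; i < blocks; ++i)
        vo[i] = _mm256_add_pd(va[i], vb[i]);

    // Scalar tail for the last n % 4 elements.
    for (std::size_t i = blocks * 4; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

When alignment cannot be guaranteed, _mm256_loadu_pd and _mm256_storeu_pd do the same job without any cast.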