Using a union (encapsulated in a struct) to bypass conversions for NEON data types

失恋的感觉 2020-12-04 00:57

My first experience with vectorization intrinsics was with SSE, where there is basically only one data type, __m128i. Switching to Neon I found the data types a

3 answers

醉梦人生 (original poster)

2020-12-04 01:10
Since the initially proposed method (type punning through a union) has undefined behaviour in C++, I implemented something like this:
```
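#include <cstring>                  // memcpy
#include <boost/static_assert.hpp>  // BOOST_STATIC_ASSERT_MSG

// Wraps a NEON vector of type T and converts to or from any type of the same size.
template <typename T>
struct NeonVectorType {

    private:
    T data;

    public:
    // Convert to U by copying the bytes of the stored vector.
    template <typename U>
    operator U () {
        BOOST_STATIC_ASSERT_MSG(sizeof(U) == sizeof(T),"Trying to convert to data type of different size");
        U u;
        memcpy( &u, &data, sizeof u );
        return u;
    }

    // Assign from U by copying its bytes into the stored vector.
    template <typename U>
    NeonVectorType& operator =(const U& in) {
        BOOST_STATIC_ASSERT_MSG(sizeof(U) == sizeof(T),"Trying to copy from data type of different size");
        memcpy( &data, &in, sizeof data );
        return *this;
    }

};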
```
Then:
```
typedef NeonVectorType<uint8x16_t> uint_128bit_t; //suitable for uint8x16_t, uint8x8x2_t, uint32x4_t, etc.
typedef NeonVectorType<uint8x8_t> uint_64bit_t; //suitable for uint8x8_t, uint32x2_t, etc.
```
The use of memcpy is discussed here (and here); it avoids breaking the strict aliasing rule. Note that compilers generally optimize the copy away.
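To make the memcpy idiom concrete without needing an ARM toolchain, here is a minimal sketch. The helper name `pun_cast` and the structs `U32x2` and `U8x8` are hypothetical stand-ins (for `uint32x2_t` and `uint8x8_t`), not part of the original answer, and the sketch assumes C++11 for `static_assert`.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reinterpret the bytes of `from` as a To by copying
// them, instead of reading through a union member or a casted pointer.
template <typename To, typename From>
To pun_cast(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "types must have the same size");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "memcpy punning requires trivially copyable types");
    To to;
    std::memcpy(&to, &from, sizeof to);  // byte-wise copy, no aliasing violation
    return to;
}

// Portable stand-ins for 64-bit NEON vectors (assumption, for illustration only).
struct U32x2 { std::uint32_t lane[2]; };  // plays the role of uint32x2_t
struct U8x8  { std::uint8_t  lane[8]; };  // plays the role of uint8x8_t
```

With these definitions, `pun_cast<U8x8>(some_u32x2)` yields the same eight bytes viewed as 8-bit lanes; which byte of each 32-bit lane comes first depends on the target's endianness.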

If you look at the edit history, I had implemented a custom version with combine operators for vectors of vectors (e.g. uint8x8x2_t). The problem was mentioned here. However, since those data types are declared as arrays (see guide, section 12.2.2) and therefore located in consecutive memory locations, the compiler is bound to treat the memcpy correctly.
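The contiguity argument can be illustrated with portable stand-ins. In the sketch below, `Vec8`, `Vec8x2` and `Vec16` are hypothetical types that mimic the layout of `uint8x8_t`, `uint8x8x2_t` (a struct holding a `val[2]` array) and `uint8x16_t`; `flatten` is an illustrative name, not something from the original answer.

```cpp
#include <cstdint>
#include <cstring>

struct Vec8   { std::uint8_t lane[8]; };   // stand-in for uint8x8_t
struct Vec8x2 { Vec8 val[2]; };            // stand-in for uint8x8x2_t
struct Vec16  { std::uint8_t lane[16]; };  // stand-in for uint8x16_t

// The copy is only meaningful if both types occupy the same number of bytes.
static_assert(sizeof(Vec8x2) == sizeof(Vec16), "layouts must match in size");

// Because val[0] and val[1] are array elements, they sit back to back in
// memory, so a single memcpy sees val[0] followed directly by val[1].
Vec16 flatten(const Vec8x2& pair) {
    Vec16 out;
    std::memcpy(&out, &pair, sizeof out);
    return out;
}
```

Here `val[0]` ends up in bytes 0 to 7 of the result and `val[1]` in bytes 8 to 15, which is a plain concatenation rather than any interleaving.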

Finally, to print the content of the variable one could use a function like this.
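The linked function is not reproduced on this page. As a rough sketch of what such a helper might look like, the following formats the raw bytes of any trivially copyable value as hex. The name `to_hex_bytes` and the stand-in type `Vec8` are assumptions made for illustration, not the original author's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

struct Vec8 { std::uint8_t lane[8]; };  // portable stand-in for uint8x8_t

// Hypothetical helper: copy the value into a byte buffer with memcpy, then
// format each byte as two lowercase hex digits separated by spaces.
template <typename V>
std::string to_hex_bytes(const V& v) {
    unsigned char bytes[sizeof(V)];
    std::memcpy(bytes, &v, sizeof bytes);

    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < sizeof bytes; ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(bytes[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}
```

On an ARM target the same template should accept a real NEON vector directly, for example `to_hex_bytes(vdup_n_u8(7))`, since it only relies on the value being copyable byte by byte.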