Using a union (encapsulated in a struct) to bypass conversions for Neon data types

失恋的感觉 2020-12-04 00:57

My first experience with vectorization intrinsics was with SSE, where there is basically only one data type, __m128i. Switching to Neon I found the data types a

3 answers
  •  醉梦人生
    2020-12-04 01:10

    Since the initial proposed method has undefined behaviour in C++, I have implemented something like this:

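    #include <cstring>                  // memcpy
    #include <boost/static_assert.hpp>  // BOOST_STATIC_ASSERT_MSG

    template <typename T>
    struct NeonVectorType {

        private:
        T data;  // underlying storage, e.g. a 64-bit or 128-bit Neon vector

        public:
        // Convert to any type U of the same size by copying the bytes.
        template <typename U>
        operator U () const {
            BOOST_STATIC_ASSERT_MSG(sizeof(U) == sizeof(T),"Trying to convert to data type of different size");
            U u;
            memcpy( &u, &data, sizeof u );
            return u;
        }

        // Assign from any type U of the same size by copying the bytes.
        template <typename U>
        NeonVectorType& operator =(const U& in) {
            BOOST_STATIC_ASSERT_MSG(sizeof(U) == sizeof(T),"Trying to copy from data type of different size");
            memcpy( &data, &in, sizeof data );
            return *this;
        }

    };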
    

    Then:

    typedef NeonVectorType<uint8x16_t> uint_128bit_t; //suitable for uint8x16_t, uint8x8x2_t, uint32x4_t, etc.
    typedef NeonVectorType<uint8x8_t> uint_64bit_t; //suitable for uint8x8_t, uint32x2_t, etc.
    

    The use of memcpy is discussed here (and here); it avoids breaking the strict aliasing rule. Note that compilers generally optimize the memcpy call away, so it usually adds no run-time cost.
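
    To show the memcpy technique in isolation, here is a minimal sketch. The helper name bit_copy and the structs Bytes16 and Words4 are hypothetical; the structs only stand in for 128-bit Neon vectors so that the code builds without <arm_neon.h>. It also assumes C++11, so static_assert replaces the Boost macro.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-ins for 128-bit Neon vectors (assumption: same size as uint8x16_t / uint32x4_t).
struct Bytes16 { std::uint8_t  lane[16]; };
struct Words4  { std::uint32_t lane[4];  };

// Hypothetical helper: copy the object representation of `from` into a `To`.
// memcpy is used instead of a pointer cast or a union, so no aliasing rule is broken.
template <typename To, typename From>
To bit_copy(const From& from) {
    static_assert(sizeof(To) == sizeof(From),
                  "Trying to convert to data type of different size");
    To to;
    std::memcpy(&to, &from, sizeof to);
    return to;
}

// Usage: view 16 bytes as four 32-bit lanes and back; the bytes are unchanged.
inline void bit_copy_example() {
    Bytes16 a;
    for (int i = 0; i < 16; ++i) a.lane[i] = static_cast<std::uint8_t>(i);
    Words4  w = bit_copy<Words4>(a);
    Bytes16 b = bit_copy<Bytes16>(w);
    assert(std::memcmp(&a, &b, sizeof a) == 0);
}
```

    The values of the individual 32-bit lanes depend on the byte order of the target, which is why the example only checks the round trip.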

    If you look at the edit history, I had implemented a custom version with combine operators for vectors of vectors (e.g. uint8x8x2_t). The problem was mentioned here. However, since those data types are declared as arrays (see guide, section 12.2.2) and therefore located in consecutive memory locations, the compiler is bound to treat the memcpy correctly.
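
    The layout argument can be illustrated with stand-in types. This is only a sketch: U8x8, U8x8x2 and U8x16 are hypothetical structs that mirror the shape of uint8x8_t, uint8x8x2_t and uint8x16_t (real Neon types need <arm_neon.h> and an ARM target), and combine is a made-up helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct U8x8   { std::uint8_t lane[8];  };  // stand-in for uint8x8_t
struct U8x8x2 { U8x8 val[2];           };  // same shape as uint8x8x2_t: an array of two vectors
struct U8x16  { std::uint8_t lane[16]; };  // stand-in for uint8x16_t

// The two halves are array elements, so they sit back to back in memory.
static_assert(sizeof(U8x8x2) == sizeof(U8x16),
              "pair of 64-bit vectors must be as large as one 128-bit vector");

// Hypothetical helper: reinterpret a pair of 64-bit vectors as one 128-bit vector.
inline U8x16 combine(const U8x8x2& pair) {
    U8x16 out;
    std::memcpy(&out, &pair, sizeof out);
    return out;
}

// Usage: val[0] lands in lanes 0..7 and val[1] in lanes 8..15.
inline void combine_example() {
    U8x8x2 pair;
    for (int i = 0; i < 8; ++i) {
        pair.val[0].lane[i] = static_cast<std::uint8_t>(i);
        pair.val[1].lane[i] = static_cast<std::uint8_t>(8 + i);
    }
    U8x16 whole = combine(pair);
    for (int i = 0; i < 16; ++i) assert(whole.lane[i] == i);
}
```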

    Finally, to print the content of the variable one could use a function like this.
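
    One possible shape for such a function is sketched below. It is not the function the answer refers to; the name to_hex_bytes is hypothetical. It copies the bytes of any object with memcpy and formats them as hex in memory order, so it would work for a wrapped Neon vector as well as for the plain array used here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: return the bytes of `value` as space-separated two-digit hex.
template <typename T>
std::string to_hex_bytes(const T& value) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));  // same memcpy trick, no aliasing issue

    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (i != 0) out << ' ';
        out << std::setw(2) << static_cast<unsigned>(bytes[i]);
    }
    return out.str();
}

// Usage: four raw bytes are printed in memory order.
inline void to_hex_bytes_example() {
    const unsigned char raw[4] = {0x00, 0x01, 0xab, 0xff};
    assert(to_hex_bytes(raw) == "00 01 ab ff");
}
```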
