I\'m trying to optimize my code using SSE intrinsics but am running into a problem where I don\'t know of a good way to extract the integer values from a vector after I\'ve
_mm_extract_epi32
The extract intrinsics is indeed the best option but if you need to support SSE2, I'd recommend this:
inline int get_x(const __m128i& vec){return _mm_cvtsi128_si32 (vec);}
inline int get_y(const __m128i& vec){return _mm_cvtsi128_si32 (_mm_shuffle_epi32(vec,0x55));}
inline int get_z(const __m128i& vec){return _mm_cvtsi128_si32 (_mm_shuffle_epi32(vec,0xAA));}
inline int get_w(const __m128i& vec){return _mm_cvtsi128_si32 (_mm_shuffle_epi32(vec,0xFF));}
I've found that if you reinterpret_cast/union the vector to any int[4] representation the compiler tends to flush things back to memory (which may not be that bad) and reads it back as an int, though I haven't looked at the assembly to see if the latest versions of the compilers generate better code.