What are GCC's intrinsics for loading 4 ints into __m128 and 8 ints into __m256 (aligned/unaligned)? What about unsigned int?
Using Intel's SSE intrinsics, the ones you're looking for are:
_mm_load_si128()
_mm_loadu_si128()
_mm256_load_si256()
_mm256_loadu_si256()
There's no distinction between signed or unsigned. You'll need to cast the pointer to __m128i* or __m256i*.
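As a rough sketch of what that cast looks like in practice (the helper names load4_aligned, load4_unaligned, load8_aligned, load8_unaligned and load_demo are made up for illustration; only the _mm* calls are Intel's), assuming a C11 compiler on x86 and, for the 256-bit part, a build with AVX enabled (e.g. -mavx):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <immintrin.h>  /* declares the SSE2 and AVX intrinsics */

/* Hypothetical helpers; only the _mm* calls come from Intel's headers. */

/* Aligned load of 4 ints: p must be 16-byte aligned. */
static inline __m128i load4_aligned(const int32_t *p) {
    return _mm_load_si128((const __m128i *)p);
}

/* Unaligned load of 4 ints: any address works. Unsigned data uses the
 * same intrinsic, only the pointer cast differs. */
static inline __m128i load4_unaligned(const uint32_t *p) {
    return _mm_loadu_si128((const __m128i *)p);
}

#ifdef __AVX__  /* only compiled when AVX is enabled */
/* Aligned load of 8 ints: p must be 32-byte aligned. */
static inline __m256i load8_aligned(const int32_t *p) {
    return _mm256_load_si256((const __m256i *)p);
}

/* Unaligned load of 8 ints. */
static inline __m256i load8_unaligned(const uint32_t *p) {
    return _mm256_loadu_si256((const __m256i *)p);
}
#endif

/* Round-trip check: load, store back, compare bytes. Returns 1 on success. */
static int load_demo(void) {
    _Alignas(32) int32_t s[8] = {1, -2, 3, -4, 5, -6, 7, -8};
    uint32_t u[9] = {0, 10, 20, 30, 40, 50, 60, 70, 80};
    int32_t out_s[8];
    uint32_t out_u[8];

    /* u + 1 is not guaranteed to be 16-byte aligned, so use the loadu form. */
    _mm_storeu_si128((__m128i *)out_s, load4_aligned(s));
    _mm_storeu_si128((__m128i *)out_u, load4_unaligned(u + 1));
    assert(memcmp(out_s, s, 16) == 0);
    assert(memcmp(out_u, u + 1, 16) == 0);

#ifdef __AVX__
    _mm256_storeu_si256((__m256i *)out_s, load8_aligned(s));
    _mm256_storeu_si256((__m256i *)out_u, load8_unaligned(u + 1));
    assert(memcmp(out_s, s, 32) == 0);
    assert(memcmp(out_u, u + 1, 32) == 0);
#endif
    return 1;
}
```

The loads only copy bits, so signedness matters later, when you pick the arithmetic or comparison intrinsics that operate on the register.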
Note that these are Intel's SSE intrinsics and will work in GCC, Clang, MSVC, and ICC.
The GCC intrinsics work only in, well, GCC, AFAIK.