How to emulate _mm256_loadu_epi32 with gcc or clang?
Question

Intel's Intrinsics Guide lists the intrinsic _mm256_loadu_epi32:

    __m256i _mm256_loadu_epi32 (void const* mem_addr);
    /*
       Instruction: vmovdqu32 ymm, m256
       CPUID Flags: AVX512VL + AVX512F
       Description
           Load 256-bits (composed of 8 packed 32-bit integers) from memory into dst.
           mem_addr does not need to be aligned on any particular boundary.
       Operation
           a[255:0] := MEM[mem_addr+255:mem_addr]
           dst[MAX:256] := 0
    */

But clang and gcc do not provide this intrinsic. Instead they provide (in file avx512vlintrin