How to move a double in %rax into a particular qword position of %ymm or %zmm? (Kaby Lake or later)
The idea is to collect returned double values into a vector register so that they can be processed one full vector width at a time, without storing them back to memory first. The processing is a vfma whose other two operands are both constexpr, so they can simply be materialized with _mm256_setr_pd or with an aligned/unaligned load from a constexpr array. Is there a way to place a double at a particular position in %ymm directly from the value in %rax, for this collecting purpose? The target machine is Kaby Lake. More efficient approaches using future vector instructions are also welcome. Inline-assembly is