How can I write a QuadWord from AVX512 register zmm26 to the rax register?

深忆病人 2021-01-17 23:40

I wish to perform integer arithmetic operations on Quad Word elements of the zmm 0-31 register set and preserve the carry bit resulting from those operations. It appears th

2 answers
  •  慢半拍i (original poster)
    2021-01-18 00:37

    Unlike some earlier SIMD extensions, which had "extract" instructions such as pextrq that do this directly, I'm not aware of any way to do it in AVX-512 (nor in AVX with ymm registers) other than:

    1. Permuting/shuffling the element you want into the lower order quadword and then using vmovq as you noted to get it into a general purpose register.

    2. Storing the entire vector to a temporary memory location loc, such as the stack, then using mov register,[loc + offset] instructions to read whichever qword(s) you are interested in.
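
    The two approaches above might look roughly like this. This is only a sketch in NASM-style Intel syntax, assuming AVX-512F is available; the element index (5), the scratch register zmm0, and the second destination rdx are arbitrary choices for illustration, not something taken from the question.

    ```asm
    ; Approach 1: move the wanted qword into element 0, then vmovq.
    ; valignq with the same register as both sources acts as a rotate
    ; right by imm8 qword elements, so element 5 ends up in element 0.
    valignq   zmm0, zmm26, zmm26, 5     ; zmm0[0] = zmm26[5]
    vmovq     rax, xmm0                 ; rax = low qword of zmm0

    ; Approach 2: spill the whole vector to the stack, then use plain loads.
    sub       rsp, 64                   ; reserve 64 bytes (alignment not guaranteed)
    vmovdqu64 [rsp], zmm26              ; unaligned 512-bit store
    mov       rax, [rsp + 5*8]          ; rax = element 5
    mov       rdx, [rsp + 2*8]          ; rdx = element 2 (further extracts are just loads)
    add       rsp, 64                   ; release the temporary
    ```

    If the stack slot is known to be 64-byte aligned, vmovdqa64 could be used for the store instead.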

    Both approaches seem pretty ugly, and which is better depends on your exact scenario. Despite using memory as an intermediary, the second approach may be faster if you plan to extract several values from each vector, since you can make use of both load ports on recent CPUs, each with a throughput of one load per cycle, while the permute/shuffle approach is likely to bottleneck on the port required for the permute/shuffle.

    See Peter's answer below for a more comprehensive treatment, including using the vcompress instructions with a mask as a kind of poor-man's extract.
