What is the most efficient way to clear a single or a few ZMM registers on Knights Landing?
Question

Say I want to clear 4 zmm registers. Is the following code the fastest way to do it?

    vpxorq zmm0, zmm0, zmm0
    vpxorq zmm1, zmm1, zmm1
    vpxorq zmm2, zmm2, zmm2
    vpxorq zmm3, zmm3, zmm3

On AVX2, when I wanted to clear ymm registers, vpxor was the fastest choice, faster than vxorps, since vpxor could run on multiple execution units. On AVX512, we don't have vpxor for zmm registers, only vpxorq and vpxord. Is that an efficient way to clear a register? Is the CPU smart enough to not make false dependencies on previous