While trying to answer "Embedded broadcasts with intrinsics and assembly", I tried to do something like this:
__m512 mul_bcast(__m512 a, float b) {
asm
It seems like all recent versions of GCC will accept both 'q' and 'x' as modifiers to print the XMM version of a YMM register.
Intel's icc appears to accept 'q', but not 'x' (at least through version 13.0.1).
[Edit: Well, it worked in the small example below, but in a real test case, I'm having problems with icc 14.0.3 accepting the 'q' but still writing a 'ymm'.]
[Edit: Testing with more recent versions of icc, I'm finding that neither icc 15 nor icc 16 works with either 'q' or 'x'.]
But Clang 3.6 and earlier accept neither syntax. And at least on Godbolt, Clang 3.7 crashes with both!
// inline assembly modifiers to convert ymm to xmm
#include <stdint.h>    // uint32_t
#include <immintrin.h> // __m128i, __m256i
// gcc accepts "%x1" as well as "%q1"
// icc accepts "%q1" but not "%x1"
// clang-3.6 accepts neither
// clang-3.7 crashes with both!
#define ASM_MOVD(vec, reg) \
__asm volatile("vmovd %q1, %0" : \
"=r" (reg) : \
"x" (vec) \
);
uint32_t movd_ymm(__m256i ymm) {
uint32_t low;
ASM_MOVD(ymm, low);
return low;
}
uint32_t movd_xmm(__m128i xmm) {
uint32_t low;
ASM_MOVD(xmm, low);
return low;
}
Link to test on Godbolt: http://goo.gl/bOkjNu
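Since the modifiers behave so differently across compilers, one possible workaround is to avoid them entirely: cast the YMM value to a __m128i with an intrinsic first, so the asm operand is already an XMM-typed value and a plain %1 should print an xmm name. The following is only a sketch under assumptions, not something I've checked on the icc and Clang versions listed above. It assumes a GCC- or Clang-compatible compiler on x86-64 with AT&T syntax; the names movd_via_cast and low_lane_demo are made up for illustration.

```c
#include <stdint.h>
#include <immintrin.h>

/* Assumption: the target attribute enables AVX for these functions
   on GCC/Clang, so the file does not have to be built with -mavx.
   Running it still requires a CPU that supports AVX. */
__attribute__((target("avx")))
static inline uint32_t movd_via_cast(__m256i ymm) {
    /* Reinterpret the low 128 bits of the YMM value as an XMM value.
       This cast is a type change only and emits no instruction. */
    __m128i lo = _mm256_castsi256_si128(ymm);
    uint32_t low;
    /* The operand is a __m128i, so no 'q' or 'x' modifier is needed. */
    __asm__("vmovd %1, %0" : "=r"(low) : "x"(lo));
    return low;
}

/* Hypothetical helper: fills a YMM register with v and reads back
   the low 32-bit lane through the asm statement above. */
__attribute__((target("avx")))
uint32_t low_lane_demo(uint32_t v) {
    return movd_via_cast(_mm256_set1_epi32((int)v));
}
```

The trade-off is that the compiler, not the asm template, decides how the 256-bit value ends up viewed as 128 bits, which is usually what you want when the modifier itself is the unreliable part.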
(Sorry that this isn't a full answer to your question, but it seemed like useful information to share and was too long for a comment.)