In GNU C inline asm, what are the size-override modifiers for xmm/ymm/zmm for a single operand?

没有蜡笔的小新 2020-11-30 12:23

While trying to answer Embedded broadcasts with intrinsics and assembly, I was trying to do something like this:

__m512 mul_bcast(__m512 a, float b) {
    asm         


        
2 Answers
  • 2020-11-30 12:50

    From the file gcc/config/i386/i386.c of the GCC sources:

           b -- print the QImode name of the register for the indicated operand.
            %b0 would print %al if operands[0] is reg 0.
           w --  likewise, print the HImode name of the register.
           k --  likewise, print the SImode name of the register.
           q --  likewise, print the DImode name of the register.
           x --  likewise, print the V4SFmode name of the register.
           t --  likewise, print the V8SFmode name of the register.
           g --  likewise, print the V16SFmode name of the register.
           h -- print the QImode name for a "high" register, either ah, bh, ch or dh.
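
    As a rough illustration of the `x` modifier from the list above, here is a minimal sketch, assuming GCC on x86-64 with the default AT&T asm syntax and a CPU that supports AVX. The function name `keep_low_half` is hypothetical and not taken from the GCC sources; the `target("avx")` attribute is used only so the snippet builds without `-mavx`.

```c
#include <immintrin.h>

/* Hypothetical helper: copies the low 128 bits of a 256-bit vector and
   zeroes the upper 128 bits. Both asm operands are __m256 values living in
   ymm registers; the 'x' modifier prints their xmm names instead, so the
   assembler sees e.g. "vmovaps %xmm1, %xmm0". A VEX-encoded write to an
   xmm register zeroes the upper half of the corresponding ymm register. */
__attribute__((target("avx")))
void keep_low_half(const float in[8], float out[8]) {
    __m256 v = _mm256_loadu_ps(in);
    __m256 r;
    __asm__("vmovaps %x1, %x0" : "=x"(r) : "x"(v));
    _mm256_storeu_ps(out, r);
}
```

    Called with the input {1, 2, 3, 4, 5, 6, 7, 8}, this should leave the first four floats unchanged and set the last four to zero. By the same list, `%t0` and `%g0` would print the ymm and zmm names of the operand.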
    

    Similarly, from gcc/config/i386/constraints.md:

        ;; We use the Y prefix to denote any number of conditional register sets:
        ;;  z   First SSE register.
        ;;  i   SSE2 inter-unit moves to SSE register enabled
        ;;  j   SSE2 inter-unit moves from SSE register enabled
        ;;  m   MMX inter-unit moves to MMX register enabled
        ;;  n   MMX inter-unit moves from MMX register enabled
        ;;  a   Integer register when zero extensions with AND are disabled
        ;;  p   Integer register when TARGET_PARTIAL_REG_STALL is disabled
        ;;  f   x87 register when 80387 floating point arithmetic is enabled
        ;;  r   SSE regs not requiring REX prefix when prefixes avoidance is enabled
        ;;  and all SSE regs otherwise
    

    This file also defines a "Yk" constraint, but I don't know how well it would work in an asm statement:

        (define_register_constraint "Yk" "TARGET_AVX512F ? MASK_EVEX_REGS : NO_REGS"
        "@internal Any mask register that can be used as predicate, i.e. k1-k7.")
    

    Note that this is all copied from the latest SVN revision. I don't know in which GCC release, if any, the particular modifiers and constraints you're interested in were added.

  • 2020-11-30 13:15

    It seems like all recent versions of GCC will accept both 'q' and 'x' as modifiers to print the XMM version of a YMM register.

    Intel's icc looks to accept 'q', but not 'x' (at least through version 13.0.1).

    [Edit: Well, it worked in this small example below, but in a real test case, I'm having problems with icc 14.0.3 accepting the 'q' but writing a 'ymm'.]

    [Edit: Testing with more recent versions of icc, I'm finding that neither icc 15 nor icc 16 work with either 'q' or 'x'.]

    But Clang 3.6 and earlier accept neither syntax. And at least on Godbolt, Clang 3.7 crashes with both!

    // inline assembly modifiers to convert ymm to xmm
    
    #include <x86intrin.h>
    #include <stdint.h>
    
    // gcc accepts "%q1" as well as "%x1"
    // icc accepts "%q1" but not "%x1"
    // clang-3.6 accepts neither
    // clang-3.7 crashes with both!
    
    #define ASM_MOVD(vec, reg)       \
    __asm volatile("vmovd %q1, %0" : \
                   "=r" (reg) :      \
                   "x" (vec)         \
        );          
    
    uint32_t movd_ymm(__m256i ymm) {
       uint32_t low;
       ASM_MOVD(ymm, low);
       return low;
    }
    
    uint32_t movd_xmm(__m128i xmm) {
       uint32_t low;
       ASM_MOVD(xmm, low);
       return low;
    }
    

    Link to test on Godbolt: http://goo.gl/bOkjNu

    (Sorry that this isn't a full answer to your question, but it seemed like useful information to share and was too long for a comment.)
