How to make GCC generate bswap instruction for big endian store without builtins?

栀梦 2020-12-05 04:55

I'm working on a function that stores a 64-bit value into memory in big endian format. I was hoping that I could write portable C99 code that works on both little a

3 Answers
  •  时光取名叫无心
    2020-12-05 05:50

    I like Peter's solution, but here's something else you can use on Haswell. Haswell has the movbe instruction, which is 3 uops there (no cheaper than bswap r64 + a normal load or store), but is faster on Atom / Silvermont (https://agner.org/optimize/):

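    #include <stdint.h>

    // AT&T syntax, compile without -masm=intel
    // static inline avoids the missing external definition that a bare C99 inline can cause
    static inline
    uint64_t load_bigend_u64(uint64_t value)
    {
        // movbe loads from memory and byte-swaps in one instruction
        __asm__ ("movbe %[src], %[dst]"   // x86-64 only
                 :  [dst] "=r" (value)    // output: any general-purpose register
                 :  [src] "m" (value)     // input: must be a memory operand
                );
        return value;
    }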
    

    Use it with something like uint64_t tmp = load_bigend_u64(array[i]);

    You could reverse this to make a store_bigend function, or use bswap to modify a value in a register and let the compiler load/store it.
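As a rough illustration of the reversed direction, here is a minimal sketch of a store function. The name `store_bigend_u64` and the opt-in macro `USE_MOVBE` are hypothetical and not part of the answer above. The `movbe` path is only compiled when you define that macro yourself on an x86-64 CPU known to support the instruction; otherwise a plain byte-by-byte fallback is used.

```c
#include <stdint.h>

/* Hypothetical opt-in switch: build with -DUSE_MOVBE only on an
   x86-64 CPU that is known to support movbe. */
#if defined(__x86_64__) && defined(USE_MOVBE)
static inline void store_bigend_u64(uint64_t *dst, uint64_t value)
{
    /* AT&T syntax: register source first, memory destination second. */
    __asm__ ("movbe %[src], %[dst]"
             : [dst] "=m" (*dst)
             : [src] "r" (value));
}
#else
/* Portable fallback: write the most significant byte first. */
static inline void store_bigend_u64(uint64_t *dst, uint64_t value)
{
    unsigned char *p = (unsigned char *)dst;
    for (int i = 0; i < 8; i++)
        p[i] = (unsigned char)(value >> (56 - 8 * i));
}
#endif
```

Use it with something like `store_bigend_u64(&array[i], tmp);`. Whether the compiler turns the fallback loop into a single bswap or movbe depends on the GCC version and optimization level, so check the generated assembly rather than assuming it.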


    I changed the function to return the value because the alignment of vdest was not clear to me.

    Usually a feature is guarded by a preprocessor macro. I'd expect __MOVBE__ to be used for the movbe feature flag, but it's not present in the output below, even though this machine has the feature:

    $ gcc -march=native -dM -E - < /dev/null | sort
    ...
    #define __LWP__ 1
    #define __LZCNT__ 1
    #define __MMX__ 1
    #define __MWAITX__ 1
    #define __NO_INLINE__ 1
    #define __ORDER_BIG_ENDIAN__ 4321
    ...
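
When no preprocessor macro is available, one possible alternative is a runtime check. The sketch below is an assumption on my part, not something the answer above proposes: it uses `__get_cpuid` from GCC's `<cpuid.h>` and tests CPUID leaf 1, ECX bit 22, which is the MOVBE feature bit. The helper name `cpu_has_movbe` is hypothetical.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: returns 1 if the CPU reports MOVBE support, else 0. */
static int cpu_has_movbe(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    /* CPUID.01H:ECX bit 22 is the MOVBE feature flag. */
    return (int)((ecx >> 22) & 1u);
#else
    /* Not an x86 target, so movbe cannot be available. */
    return 0;
#endif
}
```

A caller could use the result to choose between a movbe-based routine and a portable one at startup.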
    
