I'm working on a function that stores a 64-bit value into memory in big endian format. I was hoping that I could write portable C99 code that works on both little a
I like Peter's solution, but here's something else you can use on Haswell. Haswell has the movbe instruction, which is 3 uops there (no cheaper than bswap r64 + a normal load or store), but is faster on Atom / Silvermont (https://agner.org/optimize/):
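// AT&T syntax, compile without -masm=intel
#include <stdint.h>

// static inline: a plain C99 inline would also need an extern definition
static inline
uint64_t load_bigend_u64(uint64_t value)
{
    // movbe loads from memory and byte-swaps in a single instruction
    __asm__ ("movbe %[src], %[dst]"   // x86-64 only
             : [dst] "=r" (value)     // output: any general-purpose register
             : [src] "m" (value)      // input: movbe needs a memory operand
             );
    return value;
}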
Use it with something like uint64_t tmp = load_bigend_u64(array[i]);
You could reverse this to make a store_bigend function, or use bswap to modify a value in a register and let the compiler load/store it.
I changed the function to return value because the alignment of vdest was not clear to me.
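For the bswap route, here is a minimal sketch of what a store function could look like. The name store_bigend_u64 is made up for illustration, and it assumes GCC or Clang, since it relies on __builtin_bswap64 and the predefined __BYTE_ORDER__ macros. It uses memcpy for the store, so the destination does not need to be aligned:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the question): store `value` at `dst`
   in big-endian byte order. `dst` may be unaligned. */
static inline void store_bigend_u64(void *dst, uint64_t value)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    /* Little-endian host: reverse the bytes in a register first.
       __builtin_bswap64 is a GCC/Clang builtin, not standard C99. */
    value = __builtin_bswap64(value);
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* Big-endian host: the value is already in the right order. */
#else
#error "unknown byte order"
#endif
    /* memcpy sidesteps alignment and strict-aliasing problems; compilers
       normally turn a fixed-size copy like this into a plain store. */
    memcpy(dst, &value, sizeof value);
}
```

Use it with something like store_bigend_u64(buf, x); where buf points to at least 8 bytes.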
Usually a feature is guarded by a preprocessor macro. I'd expect __MOVBE__ to be used for the movbe feature flag, but it's not present (this machine has the feature):
$ gcc -march=native -dM -E - < /dev/null | sort
...
#define __LWP__ 1
#define __LZCNT__ 1
#define __MMX__ 1
#define __MWAITX__ 1
#define __NO_INLINE__ 1
#define __ORDER_BIG_ENDIAN__ 4321
...
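If your compiler does define __MOVBE__ (an assumption; it was missing from the output above), a guarded version might look like the sketch below. The name load_swapped_u64 is made up for illustration, and the fallback relies on the GCC/Clang builtin __builtin_bswap64, so this is not strictly portable C99:

```c
#include <stdint.h>

/* Hypothetical wrapper: load a 64-bit value from `src` with its bytes
   reversed. Takes the movbe path only when the compiler advertises it. */
static inline uint64_t load_swapped_u64(const uint64_t *src)
{
#if defined(__x86_64__) && defined(__MOVBE__)
    uint64_t value;
    /* AT&T syntax: source (memory) first, destination (register) second. */
    __asm__ ("movbe %[src], %[dst]"
             : [dst] "=r" (value)
             : [src] "m" (*src));
    return value;
#else
    /* Fallback: ordinary load plus a byte swap in a register. */
    return __builtin_bswap64(*src);
#endif
}
```

Both paths return the same result, so the macro only changes which instructions get emitted.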