I would like my C function to efficiently compute the high 64 bits of the product of two 64 bit signed ints. I know how to do this in x86-64 assembly, with imulq and pullin
If you're using a relatively recent GCC on x86_64:
int64_t mulHi(int64_t x, int64_t y) {
return (int64_t)((__int128_t)x*y >> 64);
}
At -O1 and higher, this compiles to what you want:
_mulHi:
0000000000000000 movq %rsi,%rax
0000000000000003 imulq %rdi
0000000000000006 movq %rdx,%rax
0000000000000009 ret
I believe that clang and VC++ also have support for the __int128_t type, so this should also work on those platforms, with the usual caveats about trying it yourself.