In C++, suppose that:
uint64_t i;
uint64_t j;
then i * j yields a uint64_t whose value is the lower part of the product of i and j, i.e. (i * j) mod 2^64. The high 64 bits of the full 128-bit product are discarded.
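If no suitable instruction is at hand, the high part can also be recovered portably by splitting each operand into 32-bit halves. A minimal sketch, assuming nothing beyond <cstdint>; the name mulhi_portable is hypothetical and only for illustration:

```cpp
#include <cstdint>

// Hypothetical helper (not from any library): returns the high 64 bits of
// the full 128-bit product a * b using only 64-bit arithmetic.
inline std::uint64_t mulhi_portable(std::uint64_t a, std::uint64_t b) {
    // Split both operands into 32-bit halves.
    const std::uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    const std::uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    // Four 32x32 -> 64 partial products; none of them can overflow.
    const std::uint64_t lo_lo = a_lo * b_lo;
    const std::uint64_t hi_lo = a_hi * b_lo;
    const std::uint64_t lo_hi = a_lo * b_hi;
    const std::uint64_t hi_hi = a_hi * b_hi;

    // Everything that lands in bits 32..63 of the product, plus the carry
    // out of the low word. This sum is at most 2^64 - 1, so it cannot wrap.
    const std::uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;

    // The upper half of cross carries into the high word.
    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}
```

For example, mulhi_portable(1ULL << 32, 1ULL << 32) returns 1, since the full product is 2^64.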
Here's the inline asm for ARMv8 (AArch64), where umulh returns the high 64 bits:
// High (p1) and low (p0) product
uint64_t p0, p1;
// multiplicand and multiplier
uint64_t a = ..., b = ...;
p0 = a * b;                                             // low 64 bits
asm ("umulh %0, %1, %2" : "=r"(p1) : "r"(a), "r"(b));   // high 64 bits
And here's the asm for old DEC compilers:
p0 = a * b;
p1 = asm("umulh %a0, %a1, %v0", a, b);
If you have x86's BMI2 and would like to use mulxq:
asm ("mulxq %3, %0, %1" : "=r"(p0), "=r"(p1) : "d"(a), "r"(b));
And the generic x86 multiply using mulq:
asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "rm"(b) : "cc");
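The GCC-style snippets can be gathered behind one function that picks a variant at compile time. This is only a sketch: the name mul64x64 and the choice of predefined macros for dispatch are assumptions for illustration, and it assumes a GCC- or Clang-compatible compiler. The last branch uses the unsigned __int128 extension instead of asm and leaves instruction selection to the compiler.

```cpp
#include <cstdint>

// Hypothetical wrapper (name and macro dispatch are illustrative only).
// Computes the full 128-bit product of a and b:
//   p0 receives the low 64 bits, p1 the high 64 bits.
inline void mul64x64(std::uint64_t a, std::uint64_t b,
                     std::uint64_t &p0, std::uint64_t &p1) {
#if defined(__GNUC__) && defined(__x86_64__) && defined(__BMI2__)
    // mulx takes one factor implicitly from rdx and leaves the flags alone.
    asm ("mulxq %3, %0, %1" : "=r"(p0), "=r"(p1) : "d"(a), "r"(b));
#elif defined(__GNUC__) && defined(__x86_64__)
    // "rm" rather than "g": mulq cannot encode an immediate operand.
    asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "rm"(b) : "cc");
#elif defined(__GNUC__) && defined(__aarch64__)
    p0 = a * b;
    asm ("umulh %0, %1, %2" : "=r"(p1) : "r"(a), "r"(b));
#elif defined(__SIZEOF_INT128__)
    // Compiler extension: 128-bit integer type on GCC/Clang 64-bit targets.
    const unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    p0 = static_cast<std::uint64_t>(p);
    p1 = static_cast<std::uint64_t>(p >> 64);
#else
#error "no 64x64 -> 128 multiply variant selected for this target"
#endif
}
```

Calling mul64x64(a, b, p0, p1) with a = b = 2^32 should leave p0 equal to 0 and p1 equal to 1 on every branch.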