Optimize 128x128 to 256-bit multiply for Intel AVX [SIMD] [duplicate]
Question
I'm trying to implement multiplication of a 128-bit unsigned integer by two 64-bit unsigned integers using Intel AVX. The problem is that the non-vectorised version