Intrinsics for 128 multiplication and division

后端未结

关注

 2  735

In x86_64 I know that the mul and div opp codes support 128 integers by putting the lower 64 bits in the rax and the upper in the rdx registers. I was looking for some sort

相关标签:

2条回答

故里飘歌

2020-12-21 12:40

Last I looked into it the intrinsic were in a state of flux. The main reason for the intrinsics in this case appears to be due to the fact that MSVC in 64-bit mode does not allow inline assembly.

With MSVC (and I think ICC) you can use _umul128 for mul and _mulx_u64 for mulx. These don't work in GCC , at least not GCC 4.9 (_umul128 is much older than GCC 4.9). I don't know if GCC plans to support these since you can get mul and mulx indirectly through __int128 (depending on your compile options) or directly through inline assembly.

__int128 works fine until you need a larger type and a 128-bit carry. Then you need adc, adcx, or adox and these are even more of a problem with intrinsics. Intel's documentation disagree's with MSVC and the compilers don't seem to produce adox yet with these intrinsics. See this question: _addcarry_u64 and _addcarryx_u64 with MSVC and ICC.

Inline assembly is probably the best solution with GCC (and probably even ICC).

0 讨论(0)
发布评论:

提交评论
- 加载中...
耶瑟儿～

2020-12-21 12:46

gcc has __int128 and __uint128 types.

Arithmetic with them should be using the right assembly instructions when they exist; I've used them in the past to get the upper 64 bits of a product, although I've never used it for division. If it's not using the right ones, submit a bug report / feature request as appropriate.

0 讨论(0)
发布评论:

提交评论
- 加载中...