Can 128bit/64bit hardware unsigned division be faster in some cases than 64bit/32bit division on x86-64 Intel/AMD CPUs?

暖寄归人 2020-12-21 17:47

Can a scaled 64bit/32bit division performed by the hardware 128bit/64bit division instruction, such as:

; Entry arguments: Dividend in EAX, Divisor in EBX (upper half of RBX already clear)
shl rax, 32      ; scale the dividend up by 2^32
xor edx, edx     ; RDX:RAX now holds the 128-bit dividend
div rbx          ; 128bit/64bit division: RAX = RDX:RAX / RBX

be faster in some cases than the same scaled division performed by the hardware 64bit/32bit division instruction (dividend in EDX:EAX, 32-bit div ebx) on x86-64 Intel/AMD CPUs?

2 Answers
  •  青春惊慌失措
    2020-12-21 18:11

    Can 128bit/64bit hardware unsigned division be faster in some cases than 64bit/32bit division on x86-64 Intel/AMD CPUs?

    In theory, anything is possible (e.g. maybe in 50 years' time Nvidia creates an 80x86 CPU that ...).

    However, I can't think of a single plausible reason why a 128bit/64bit division would ever be faster than (not merely equivalent to) a 64bit/32bit division on x86-64.

    I suspect this because I assume that the C compiler authors are very smart, and so far I have failed to make the popular C compilers generate the 64bit/32bit division when dividing an unsigned 32-bit integer (shifted left 32 bits) by another 32-bit integer. It always compiles to the 128bit/64bit div instruction. P.S. The left shift compiles fine to shl.
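
    For concreteness, here is a minimal C sketch of the kind of function being described (the function name and exact signature are my own choice):

        #include <stdint.h>

        /* Divide a 32-bit value, scaled up by 2^32, by another 32-bit value. */
        uint64_t scaled_div(uint32_t a, uint32_t b)
        {
            return ((uint64_t)a << 32) / b;  /* popular compilers emit shl plus the
                                                64-bit div (the 128bit/64bit hardware
                                                divide), not the 32-bit div */
        }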

    Compiler developers are smart, but compilers are complex and the C language rules get in the way. For example, if you just write a = b/c; (with b being 64-bit and c being 32-bit), the language's rules are that c gets promoted to 64-bit before the division happens, so it ends up being a 64-bit divisor in some kind of intermediate language, and that makes it hard for the back-end translation (from intermediate language to assembly language) to tell that the 64-bit divisor could be a 32-bit divisor.
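
    As a small illustration of that promotion (the function name is made up), the divisor's 32-bit type is already gone by the time the division reaches the back-end:

        #include <stdint.h>

        uint64_t quotient(uint64_t b, uint32_t c)
        {
            /* The usual arithmetic conversions promote c to a 64-bit integer here,
               so the intermediate representation only sees a 64-bit / 64-bit
               division; the fact that the divisor fits in 32 bits is lost. */
            return b / c;
        }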
