How does the GCC implementation of modulo (%) work, and why does it not use the div instruction?

前端 未结 2 959
-上瘾入骨i
-上瘾入骨i 2020-12-01 12:22

I was trying to work out how to calculate modulo 10 in assembly so i compiled the following c code in gcc to see what it came up with.

unsigned int i=999;
un         


        
2条回答
  •  半阙折子戏
    2020-12-01 13:01

    Second question first: div is a very slow instruction (more than 20 clock cycles). The sequence above consists of more instructions, but they're all relatively fast, so it's a net win in terms of speed.

    The first five instructions (up to and including the shrl) compute i/10 (I'll explain how in a minute).

    The next few instructions multiply the result by 10 again, but avoiding the mul/imul instructions (whether this is a win or not depends on the exact processor you're targeting - newer x86s have very fast multipliers, but older ones don't).

    movl    %edx, %eax   ; eax=i/10
    sall    $2, %eax     ; eax=(i/10)*4
    addl    %edx, %eax   ; eax=(i/10)*4 + (i/10) = (i/10)*5
    addl    %eax, %eax   ; eax=(i/10)*5*2 = (i/10)*10
    

    This is then subtracted from i again to obtain i - (i/10)*10 which is i % 10 (for unsigned numbers).

    Finally, on the computation of i/10: The basic idea is to replace division by 10 with multiplication by 1/10. The compiler does a fixed-point approximation of this by multiplying with (2**35 / 10 + 1) - that's the magic value loaded into edx, though it's output as a signed value even though it's really unsigned - and right-shifting the result by 35. This turns out to give the right result for all 32-bit integers.

    There's algorithms to determine this kind of approximation which guarantee that the error is less than 1 (which for integers means it's the right value) and GCC obviously uses one :)

    Final remark: If you want to actually see GCC compute a modulo, make the divisor variable (e.g. a function parameter) so it can't do this kind of optimization. Anyway, on x86, you compute modulo using div. div expects the 64-bit dividend in edx:eax (high 32 bits in edx, low 32 bits in eax - clear edx to zero if you're working with a 32-bit number) and divides that by whatever operand you specify (e.g. div ebx divides edx:eax by ebx). It returns the quotient in eax and the remainder in edx. idiv does the same for signed values.

提交回复
热议问题