Unsigned 64-bit to double conversion: why this algorithm from g++

不知归路 2021-02-20 06:20

Using g++ 4.9.2, if I compile

bool int_dbl_com(const unsigned long long x, const double y)
{
    return x <= y;
}

then the assembler output (

1 Answer

半阙折子戏 2021-02-20 06:48

    .L2:
    movq       %rcx, %rax
    andl       $1, %ecx       ; save the least significant bit of %rax (a copy of the original x) in %ecx
    pxor       %xmm0, %xmm0
    shrq       %rax           ; make %rax represent half the original number, as a signed value
    orq        %rcx, %rax     ; “round to odd”: if the division by two above was not exact, ensure the result is odd
    cvtsi2sdq  %rax, %xmm0    ; convert to floating-point
    addsd      %xmm0, %xmm0   ; multiply by two
    ucomisd    %xmm0, %xmm1   ; compare y (in %xmm1) with the converted x (in %xmm0)
    setae      %al            ; %al = (y >= x), i.e. (x <= y)
    ret
    

    The last three instructions implement the <= comparison and the return from the source code. The others are all part of the conversion from uint64_t to double.

    The difficult-to-understand step is the one I commented as “round to odd”. “Rounding to odd” is a technique that prevents the nasty effects of “double rounding”.

    In effect, the algorithm is to convert from 64-bit to 63-bit, and then from 63 bits to the 53-bit significand of IEEE 754 binary64. If implemented naively, in some cases, these two conversions can produce a result that differs from a direct single conversion from 64-bit integer to floating-point with 53-bit significand. This phenomenon is what is called “double rounding”.
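
    For concreteness, here is a minimal C++ sketch of the same algorithm (the helper name is made up; this is not the compiler's source, just the generated sequence expressed in C++). Presumably the surrounding output, truncated in the question above, also contains a direct cvtsi2sdq path for values that already fit in 63 bits; the sketch includes that branch because the halving trick is only meant for values with the top bit set:

        #include <cstdint>

        // Hypothetical helper mirroring the instruction sequence above.
        double u64_to_double(std::uint64_t x)
        {
            if (x <= static_cast<std::uint64_t>(INT64_MAX)) {
                // Fits in int64_t: a single cvtsi2sdq does the whole job.
                return static_cast<double>(static_cast<std::int64_t>(x));
            }
            // 64 -> 63 bits with round to odd: halve, and if the halving was
            // not exact, force the low bit to 1 (the shrq / andl / orq above).
            std::uint64_t half = (x >> 1) | (x & 1);
            // 63 bits -> binary64 (cvtsi2sdq), then multiply by two (addsd).
            double d = static_cast<double>(static_cast<std::int64_t>(half));
            return d + d;
        }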

    Rounding to odd ensures that the intermediate rounding never produces a value that the second rounding would then round in the wrong direction: when the halving is inexact, the result is made odd, so it can never land exactly on a halfway point of the second rounding. This is enough to make the sequences below equivalent for all inputs:

    64-bit ---(round to odd)---> 63-bit ---(round to nearest even)---> binary64
    64-bit ---(round to nearest even, the conversion the compiler wants)---> binary64
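
    To see the double rounding that round-to-odd avoids, here is a small self-contained check. The test value 2^63 + 1025 is my own choice (not from the original post); on a typical x86-64 target where double is IEEE 754 binary64, the naive truncate-halve-convert-double sequence lands one ulp away from the correctly rounded result, while the round-to-odd sequence matches the direct conversion:

        #include <cstdint>
        #include <cstdio>

        int main()
        {
            // 2^63 + 1025: just above the midpoint between two consecutive
            // doubles (the spacing of doubles near 2^63 is 2^11 = 2048).
            std::uint64_t x = 0x8000000000000401ULL;

            // Direct conversion, rounded once: 2^63 + 2048.
            double direct = static_cast<double>(x);

            // Naive two-step: truncate to 63 bits, convert, multiply by two.
            // x >> 1 = 2^62 + 512 is an exact halfway case; ties-to-even drops
            // the 512, so the final result is 2^63 -- off by one ulp.
            double naive = static_cast<double>(static_cast<std::int64_t>(x >> 1)) * 2.0;

            // Round to odd first: (x >> 1) | 1 = 2^62 + 513 rounds up, and the
            // doubled result 2^63 + 2048 agrees with the direct conversion.
            double odd = static_cast<double>(static_cast<std::int64_t>((x >> 1) | (x & 1))) * 2.0;

            std::printf("direct %.1f\nnaive  %.1f\nodd    %.1f\n", direct, naive, odd);
        }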
    

    To answer other aspects of your question:

    But that's not what the code does: it seems to save the lowest-order bit of %rcx and then ors this back to the result. Why?? And why bother when these low order bits are going to be lost anyway (or am I mistaken here)?

    This is exactly how to implement round-to-odd in this particular instance. The least significant bit of %rcx is one iff the shift is not an exact division by two, and in this case, the result must be made odd.
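
    In C++ terms, the movq / andl / shrq / orq group computes the following (a sketch with a hypothetical helper name); the final OR is what forces the result odd whenever the shifted-out bit was 1:

        #include <cstdint>

        // Sketch of the movq / andl / shrq / orq sequence above.
        std::uint64_t round_to_odd_halve(std::uint64_t x)
        {
            std::uint64_t lsb  = x & 1;   // andl $1, %ecx : was the halving exact?
            std::uint64_t half = x >> 1;  // shrq %rax     : divide by two, truncating
            return half | lsb;            // orq %rcx,%rax : if not exact, make it odd
        }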

    The same algorithm seems to be used regardless of the optimisation; I used -O3 here to make it easier to see.

    The instruction sequence is optimal (as far as I can see, for modern processors) and corresponds to the source-level conversion from uint64_t to double. It requires no effort from the compiler to use it even at the lowest optimization level. What could happen with optimization (but doesn't happen here) is that these instructions get melded with instructions corresponding to other source-level constructs. But there is no point in generating anything other than the optimal sequence for a conversion at -O0.
