How do I negate a 64-bit integer stored in a 32-bit register pair?

允我心安 submitted on 2021-01-29 03:14:56

Question


I've stored a 64-bit integer in the EDX:EAX register pair. How can I correctly negate the number?

For example: 123456789123 → -123456789123.


Answer 1:


Ask a compiler for ideas: compile int64_t neg(int64_t a) { return -a; } in 32-bit mode. Depending on how you write the test function, the starting value will be in memory, in registers of the compiler's choosing, or already in EDX:EAX. See all three ways on the Godbolt compiler explorer, with asm output from gcc, clang, and MSVC (aka CL).

There are of course lots of ways to accomplish this, but any possible sequence will need some kind of carry from low to high at some point, so there's no efficient way to avoid SBB or ADC.


If the value starts in memory, or you want to keep the original value in registers, xor-zero the destination and use SUB/SBB. The SysV x86-32 ABI passes args on the stack and returns 64-bit integers in EDX:EAX. This is what clang 3.9.1 -m32 -O3 does for neg_value_from_mem:

    ; optimal for data coming from memory: just subtract from zero
    xor     eax, eax
    xor     edx, edx
    sub     eax, dword ptr [esp + 4]
    sbb     edx, dword ptr [esp + 8]
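
As a rough illustration of what SUB/SBB compute here, the same subtract-from-zero can be modeled in C on two 32-bit halves. This is only a sketch: the `pair32` type and the `neg_sub_sbb` name are hypothetical helpers invented for this example, and the `borrow` variable stands in for the carry flag.

```c
#include <stdint.h>

/* Hypothetical helper type: a 64-bit value split the way EDX:EAX holds it. */
typedef struct {
    uint32_t lo;  /* plays the role of EAX (low half)  */
    uint32_t hi;  /* plays the role of EDX (high half) */
} pair32;

/* Model of: xor eax,eax / xor edx,edx / sub eax,lo / sbb edx,hi */
static pair32 neg_sub_sbb(pair32 v) {
    pair32 r;
    /* 0 - lo borrows exactly when lo is non-zero; this is CF after SUB. */
    uint32_t borrow = (v.lo != 0);
    r.lo = 0u - v.lo;           /* sub: low half, wraps modulo 2^32   */
    r.hi = 0u - v.hi - borrow;  /* sbb: high half minus the borrow    */
    return r;
}
```

With the question's example, 123456789123 is 0x0000001C:BE991A83 as hi:lo, and the model yields 0xFFFFFFE3:4166E57D, which is the two's-complement bit pattern of -123456789123.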

If you have the values in registers and don't need the result in-place, you can use NEG to set a register to 0 - itself. NEG sets CF iff the input is non-zero, i.e. the same way SUB from zero would. Note that xor-zeroing is cheap and not part of the latency critical path, so this is definitely better than gcc's 3-instruction sequence (below).

    ;; partially in-place: input in ecx:eax
    xor     edx, edx
    neg     eax         ; eax = 0-eax, setting flags appropriately
    sbb     edx, ecx    ;; result in edx:eax

Clang does this even for the in-place case, even though that costs an extra mov ecx,edx. That's optimal for latency on modern CPUs that have zero-latency mov reg,reg (Intel IvB+ and AMD Zen), but not for number of fused-domain uops (frontend throughput) or code-size.


gcc's sequence is interesting and not totally obvious. It saves an instruction vs. clang for the in-place case, but it's worse otherwise.

    ; gcc's in-place sequence, only good for in-place use
    neg     eax
    adc     edx, 0
    neg     edx
       ; disadvantage: higher latency for the upper half than subtract-from-zero
       ; advantage: result in edx:eax with no extra registers used
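
To see why NEG / ADC / NEG gives the right answer, here is a small C model of the three instructions. It is a sketch rather than compiler output: `neg_gcc_style` is a hypothetical name and `cf` stands in for the carry flag. The idea is that -(hi + CF) equals 0 - hi - CF, the same high half that SUB/SBB would produce.

```c
#include <stdint.h>

/* Model of gcc's in-place sequence on a value held as hi:lo (EDX:EAX). */
static uint64_t neg_gcc_style(uint64_t x) {
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    uint32_t cf = (lo != 0);  /* neg eax sets CF iff eax was non-zero */
    lo = 0u - lo;             /* neg eax                              */
    hi = hi + cf;             /* adc edx, 0: fold the carry in first  */
    hi = 0u - hi;             /* neg edx: -(hi + cf) == 0 - hi - cf   */

    return ((uint64_t)hi << 32) | lo;
}
```

The model makes the latency point visible: the high half goes through two dependent operations after the carry is known, versus one for subtract-from-zero.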

Unfortunately, gcc and MSVC both always use this, even when xor-zero + sub/sbb would be better.


For a more complete picture of what compilers do, have a look at their output for these functions (on Godbolt):

    #include <stdint.h>

    int64_t neg_value_from_mem(int64_t a) {
        return -a;
    }

    int64_t neg_value_in_regs(int64_t a) {
        // The OR makes the compiler load+OR first
        // but it can choose regs to set up for the negate
        int64_t reg = a | 0x1111111111LL;
        // clang chooses mov reg,mem   / or reg,imm8 when possible,
        // otherwise     mov reg,imm32 / or reg,mem.  Nice :)
        return -reg;
    }

    int64_t foo();
    int64_t neg_value_in_place(int64_t a) {
        // foo's return value will be in edx:eax
        return -foo();
    }


Source: https://stackoverflow.com/questions/41080161/how-do-i-negate-a-64-bit-integer-stored-in-a-32-bit-register-pair
