GCC inline assembly behaves strangely

Question

I am currently learning GCC's extended inline assembly. I wrote an A + B function and want to check the ZF flag, but the result is strange.

The compiler I use is gcc 7.3.1 on x86-64 Arch Linux.

I started from the following code, which correctly prints a + b.

int a, b, sum;
scanf("%d%d", &a, &b);
asm volatile (
  "movl %1, %0\n"
  "addl %2, %0\n"
  : "=r"(sum)
  : "r"(a), "r"(b)
  : "cc"
);
printf("%d\n", sum);

Then I simply added a variable to check the flags, and now it gives the wrong output.

int a, b, sum, zero;
scanf("%d%d", &a, &b);
asm volatile (
  "movl %2, %0\n"
  "addl %3, %0\n"
  : "=r"(sum), "=@ccz"(zero)
  : "r"(a), "r"(b)
  : "cc"
);
printf("%d %d\n", sum, zero);

The GAS assembly output is

  movl  -24(%rbp), %eax  # %eax = a
  movl  -20(%rbp), %edx  # %edx = b
#APP
# 6 "main.c" 1
  movl %eax, %edx
  addl %edx, %edx

# 0 "" 2
#NO_APP
  sete  %al
  movzbl  %al, %eax
  movl  %edx, -16(%rbp)  # sum = %edx
  movl  %eax, -12(%rbp)  # zero = %eax

This time, sum becomes a + a. But when I simply swap %2 and %3, the output is the correct a + b.

Then I checked various GCC versions on wandbox.org (Clang does not seem to support a flag as an output): versions 4.5.4 through 4.7.4 give the correct result a + b, and starting from version 4.8.1 the output is always a + a.

My question is: did I write the code incorrectly, or is there something wrong with GCC?

Answer 1:

The problem is that you clobber %0 before all the inputs are consumed (%2 in your first snippet, %3 in the second):

"movl %1, %0\n"
"addl %2, %0\n"

The first MOV modifies %0 before %2 has been consumed. An optimizing compiler is allowed to reuse the register of an output operand for an input operand. In your case the compiler assigned the same register to %0 and to the operand read by the ADD (%2 in the lines above), which caused the erroneous result: the assembly listing shows both b and sum in %edx.

To work around this, mark the output constraint with &. The & modifier denotes an early clobber; the GCC documentation describes it as follows:

‘&’ Means (in a particular alternative) that this operand is an earlyclobber operand, which is written before the instruction is finished using the input operands. Therefore, this operand may not lie in a register that is read by the instruction or as part of any memory address.

‘&’ applies only to the alternative in which it is written. In constraints with multiple alternatives, sometimes one alternative requires ‘&’ while others do not. See, for example, the ‘movdf’ insn of the 68000.

An operand which is read by the instruction can be tied to an earlyclobber operand if its only use as an input occurs before the early result is written. Adding alternatives of this form often allows GCC to produce better code when only some of the read operands can be affected by the earlyclobber. See, for example, the ‘mulsi3’ insn of the ARM.

Furthermore, if the earlyclobber operand is also a read/write operand, then that operand is written only after it’s used.

‘&’ does not obviate the need to write ‘=’ or ‘+’. As earlyclobber operands are always written, a read-only earlyclobber operand is ill-formed and will be rejected by the compiler.

The change to your code is to replace "=r"(sum) with "=&r"(sum). This prevents the compiler from assigning the output operand's register to any of the input operands.

A word of warning: GCC inline assembly is powerful and evil. It is very easy to get wrong if you don't know what you are doing. Use it only if you must; avoid it if you can.

Source: https://stackoverflow.com/questions/49555291/gcc-inline-assembly-behave-strangely

Tags

gcc

assembly

x86

inline-assembly

att