Why is this simple c program with gcc (clang) inline assembly exhibiting undefined behaviour?

Submitted by 帅比萌擦擦* on 2019-12-10 11:28:58

Question


I'm trying to do a very simple thing with the gcc inline assembler extension:

  • load an unsigned int variable into a register
  • add 1 to it
  • output the result

Compiling my solution:

#include <stdio.h>
#define inf_int volatile unsigned long long

int main(int argc, char *argv[]){
   inf_int zero = 0;
   inf_int one = 1;
   inf_int infinity = ~0;
   printf("value of zero, one, infinity = %llu, %llu, %llu\n", zero, one, infinity);
   __asm__ volatile (
      "addq $1, %0"
      : "=r" (infinity)
   );
   __asm__ volatile (
      "addq $1, %0"
      : "=r" (zero)
   );
   __asm__ volatile (
      "addq $1, %0"
      : "=r" (one)
   );
   printf("value of zero, one, infinity = %llu, %llu, %llu\n", zero, one, infinity);
   return 0;
}

with the following switches:

gcc -std=c99 --pedantic -Wall  -c main.c -o main.o
gcc -std=c99 --pedantic -Wall  main.o -o main

I'd expect the following result from running main:

value of zero, one, infinity = 0, 1, 18446744073709551615

value of zero, one, infinity = 1, 2, 0

but the result I get is this:

value of zero, one, infinity = 0, 1, 18446744073709551615

value of zero, one, infinity = 60, 61, 59

Interestingly, if I add a single char to the first printf I get the following, off-by-one, output:

value of zerao, one, infinity = 0, 1, 18446744073709551615

value of zero, one, infinity = 61, 62, 60

Even more interestingly, I can fix the behaviour by adding an input operand to each asm statement. But this seems wasteful, since it uses twice as many registers, and it doesn't help me understand why the previous version exhibits undefined behaviour.

#include <stdio.h>
#define inf_int volatile unsigned long long

int main(int argc, char *argv[]){
   inf_int zero = 0;
   inf_int one = 1;
   inf_int infinity = ~0;
   printf("value of zerao, one, infinity = %llu, %llu, %llu\n", zero, one, infinity);
   __asm__ volatile (
      "addq $1, %0 \n\t"
      "movq %0, %1"
      : "=r" (zero)
      : "r" (zero)
   );
   __asm__ volatile (
      "addq $1, %0 \n\t"
      "movq %0, %1"
      : "=r" (one)
      : "r" (one)
   );
   __asm__ volatile (
      "addq $1, %0 \n\t"
      "movq %0, %1"
      : "=r" (infinity)
      : "r" (infinity)
   );
   printf("value of zero, one, infinity = %llu, %llu, %llu\n", zero, one, infinity);
   return 0;
}

edit

Compiling with clang and the same options gives undefined behaviour as well:

value of zerao, one, infinity = 0, 1, 18446744073709551615

value of zero, one, infinity = 2147483590, 2147483591, 2147483592

edit 2

As suggested by Olaf, I've tried uint64_t from stdint.h. The result of running the program is still undefined.

#include <stdio.h>
#include <stdint.h>
//#define inf_int volatile unsigned long long
#define inf_int uint64_t
int main(int argc, char *argv[]){
   inf_int zero = 0;
   inf_int one = 1;
   inf_int infinity = ~0;
   printf("value of zerao, one, infinity = %lu, %lu, %lu\n", zero, one, infinity);
   __asm__ volatile (
      "addq $1, %0 \n\t"
      : "=r" (zero)
   );
   __asm__ volatile (
      "addq $1, %0 \n\t"
      : "=r" (one)
   );
   __asm__ volatile (
      "addq $1, %0 \n\t"
      : "=r" (infinity)
   );
   printf("value of zero, one, infinity = %lu, %lu, %lu\n", zero, one, infinity);
   return 0;
}

Answer 1:


Your first code does not specify any inputs to the asm statements. The "=r" constraint declares a write-only output, so the compiler never loads the variable into the chosen register, and the register starts with an undefined value (which in this case was initially the return value of printf). The second example repeats the error of using an undefined value and adds further undefined behaviour by overwriting the input register with the output.

You could use two registers like:

__asm__ (
   "movq %1, %0 \n\t"
   "addq $1, %0"
   : "=r" (zero)
   : "r" (zero)
);

You could use an input/output argument:

__asm__ (
   "addq $1, %0"
   : "+r" (zero)
);

That operand can also be allowed to live in memory as well as in a register:

__asm__ (
   "addq $1, %0"
   : "+rm" (zero)
);

Or you could tie the input to the output:

__asm__ (
   "addq $1, %0"
   : "=rm" (zero)
   : "0" (zero)
);

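And finally, there is no need for any of the volatile modifiers, neither on the asm statements nor on the variables: the asm outputs are used afterwards, so the compiler will not optimize the statements away.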
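
As a rough sketch, the four variants above can be wrapped in small functions and checked against plain C addition. This assumes x86-64 and a GCC-compatible compiler (gcc or clang), as in the question. The function names (inc_two_regs, inc_inout, inc_rm, inc_tied, check_variants) are made up for illustration and are not from the original post.

```c
#include <assert.h>
#include <stdint.h>

/* Variant 1: input and output in (possibly different) registers.
   The input is copied into the output register before the add. */
uint64_t inc_two_regs(uint64_t x) {
    __asm__ (
        "movq %1, %0 \n\t"
        "addq $1, %0"
        : "=r" (x)
        : "r" (x)
    );
    return x;
}

/* Variant 2: "+r" marks a single operand as both read and written. */
uint64_t inc_inout(uint64_t x) {
    __asm__ (
        "addq $1, %0"
        : "+r" (x)
    );
    return x;
}

/* Variant 3: "+rm" lets the compiler choose a register or memory. */
uint64_t inc_rm(uint64_t x) {
    __asm__ (
        "addq $1, %0"
        : "+rm" (x)
    );
    return x;
}

/* Variant 4: the "0" constraint ties the input to output operand 0. */
uint64_t inc_tied(uint64_t x) {
    __asm__ (
        "addq $1, %0"
        : "=rm" (x)
        : "0" (x)
    );
    return x;
}

/* Sanity check: every variant must agree with unsigned x + 1,
   including the wrap-around from UINT64_MAX to 0. */
void check_variants(uint64_t x) {
    assert(inc_two_regs(x) == x + 1);
    assert(inc_inout(x) == x + 1);
    assert(inc_rm(x) == x + 1);
    assert(inc_tied(x) == x + 1);
}
```

None of these statements is marked volatile, which matches the remark above: the result of each asm is returned and used, so the compiler has to keep it. Calling check_variants(UINT64_MAX) exercises the wrap-around to 0 that the question expected for infinity.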




Answer 2:


To wrap it all up:

Inline assembly is not part of the C standard. It is a compiler extension, so portability (even across compilers on the same hardware) is not guaranteed.

One good way to write it is the following:

#include <inttypes.h>   /* PRIu64 */
#include <stdio.h>
#include <stdint.h>
#define inf_int uint64_t
int main(int argc, char *argv[]){
   inf_int zero = 0;
   inf_int one = 1;
   inf_int infinity = ~0;
   /* PRIu64 is the matching format for uint64_t; %lu is only right where uint64_t is unsigned long */
   printf("value of zero, one, infinity = %" PRIu64 ", %" PRIu64 ", %" PRIu64 "\n", zero, one, infinity);
   /* "+r": the operand is both read and written by the asm */
   __asm__ (
      "addq $1, %0"
      : "+r" (zero)
   );
   __asm__ (
      "addq $1, %0"
      : "+r" (one)
   );
   __asm__ (
      "addq $1, %0"
      : "+r" (infinity)
   );
   printf("value of zero, one, infinity = %" PRIu64 ", %" PRIu64 ", %" PRIu64 "\n", zero, one, infinity);
   return 0;
}
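
On the portability point, one possible approach (a sketch, not something from the original answers) is to guard the inline assembly with preprocessor checks and fall back to plain C elsewhere. The helper name add_one is hypothetical. The guard assumes that both gcc and clang define __GNUC__, and that __x86_64__ is defined when targeting x86-64.

```c
#include <stdint.h>

/* add_one is a hypothetical helper, used here only for illustration. */
uint64_t add_one(uint64_t x) {
#if defined(__GNUC__) && defined(__x86_64__)
    /* GCC-style extended asm, AT&T syntax, x86-64 only. */
    __asm__ (
        "addq $1, %0"
        : "+r" (x)
    );
    return x;
#else
    /* Portable fallback: unsigned arithmetic wraps modulo 2^64. */
    return x + 1;
#endif
}
```

Either branch gives the same result, so callers do not need to know which one was compiled in.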


Source: https://stackoverflow.com/questions/31688987/why-is-this-simple-c-program-with-gcc-clang-inline-assembly-exhibiting-undefin
