Weird results for conditional operator with GCC and bool pointers

Posted by 删除回忆录丶 on 2020-06-27 06:42:11

Question


In the following code, I memset() a stdbool.h bool variable to value 123. (Perhaps this is undefined behaviour?) Then I pass a pointer to this variable to a victim function, which tries to protect against unexpected values using a conditional operation. However, GCC for some reason seems to remove the conditional operation altogether.

#include <stdio.h>
#include <stdbool.h>
#include <string.h>

void victim(bool* foo)
{
    int bar = *foo ? 1 : 0;
    printf("%d\n", bar);
}

int main()
{
    bool x;
    bool *foo = &x;
    memset(foo, 123, sizeof(bool));
    victim(foo);
    return 0;
}
user@host:~$ gcc -Wall -O0 test.c
user@host:~$ ./a.out 
123

What makes this particularly annoying is that the victim() function is actually inside a library, and will crash if the value is more than 1.

Reproduced on GCC versions 4.8.2-19ubuntu1 and 4.7.2-5. Not reproduced on clang.


Answer 1:


(Perhaps this is undefined behaviour?)

Not directly, but reading from the object afterwards is.

Quoting C99:

6.2.6 Representations of types

6.2.6.1 General

5 Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. [...]

Basically, what this means is that if a particular implementation has decided that the only two valid bytes for a bool are 0 and 1, then you'd better make sure you don't use any trickery to attempt to set it to any other value.




Answer 2:


When GCC compiles this program, the assembly language output includes the sequence

movzbl (%rax), %eax
movzbl %al, %eax
movl %eax, -4(%rbp)

which does the following:

  1. Copy 8 bits from *foo (denoted by (%rax) in assembly) to the register %eax and fill the upper 24 bits of %eax with zeros.
  2. Copy the low-order 8 bits of %eax (denoted by %al) to %eax and fill in the higher-order bits of %eax with zeros. As a C programmer you would understand this as %eax &= 0xff. After step 1 this changes nothing.
  3. Copy the value of %eax to the location 4 bytes below %rbp, which is where bar lives on the stack.

So this code is an assembly-language translation of

int bar = *foo & 0xff;

Clearly GCC has dropped the conditional on the assumption that a bool never holds any value other than 0 or 1, and it does so even at -O0.

If you change the relevant line in the C source to this

int bar = *((int*)foo) ? 1 : 0;

then the assembly changes to

movl (%rax), %eax
testl %eax, %eax
setne %al
movzbl %al, %eax
movl %eax, -4(%rbp)

which does the following:

  1. Copy 32 bits from *foo (denoted by (%rax) in assembly) to the register %eax.
  2. Test the 32 bits of %eax against itself, which means ANDing it with itself, discarding the result, and setting some flags in the processor based on it. (The ANDing itself is not needed here; testing a register against itself is simply the usual idiom for checking whether it is zero.)
  3. Set the low-order 8 bits of %eax (denoted by %al) to 1 if the result of the ANDing was non-zero, or to 0 otherwise.
  4. Copy the low-order 8 bits of %eax (denoted by %al) to %eax and fill in the higher-order bits of %eax with zeros, as in the first snippet.
  5. Copy the value of %eax to the location 4 bytes below %rbp, which is where bar lives on the stack; also as in the first snippet.

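This is actually a faithful translation of the C code. And indeed, if you add the cast to (int*) and compile and run the program, you'll see that it does output 1. Note that the cast makes the program read sizeof(int) bytes from a one-byte object, which is itself undefined behaviour, so treat it as an experiment rather than a fix.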




Answer 3:


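Reading a bool whose stored representation is anything other than 0 or 1 is undefined behavior in C on implementations such as this one, where those are the only valid representations. (Ordinary assignment cannot produce such a state, because conversion to bool always yields 0 or 1; only byte-level tricks like memset() can.)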

So actually this:

int bar = *foo ? 1 : 0;

is optimized into something close to this:

int bar = *foo ? *foo : 0;


Source: https://stackoverflow.com/questions/27661768/weird-results-for-conditional-operator-with-gcc-and-bool-pointers
