Accessing the bits in char through a bitfield

Submitted by 假装没事ソ on 2019-12-24 07:25:06

Question


I want to access the bits in a char individually. There are several questions and answers on this topic here on SO, but they all suggest using bitwise (Boolean) arithmetic. For my use, however, it would be more convenient if I could simply name the bits separately. So I was thinking of just accessing the char through a bitfield, like so:

#include <stdbool.h>
#include <stdio.h>

typedef struct {
    bool _1 : 1, _2 : 1, _3 : 1, _4 : 1, _5 : 1, _6 : 1, _7 : 1, _8 : 1;
} bits;

int main() {
    char c = 0;
    bits *b = (bits *)&c;
    b->_3 = 1;
    printf("%s\n", c & 0x4 ? "true" : "false");
}

This compiles without errors or warnings with gcc -Wall -Wextra -Wpedantic test.c. When running the resulting executable with valgrind it reports no memory faults. The assembly generated for the b->_3 = 1; assignment is or eax, 4 which is sound.

Questions

  • Is this defined behaviour in C?
  • Is this defined behaviour in C++?

N.B.: I'm aware that it might cause trouble for mixed endianness but I only have little endian.


Answer 1:


Is this defined behaviour in C?
Is this defined behaviour in C++?

TL;DR: no, it is not.

The boolean bitfield is well-defined only this far: bool is a permitted type for bit-fields, so you are guaranteed to get eight one-bit booleans allocated somewhere in memory. If you read boolean _1, you get the value you last stored in it.

What is not defined is the bit order. The compiler may insert padding bits or padding bytes as it pleases. All of that is implementation-defined and non-portable. So you can't really know where _1 is located in memory or if it is the MSB or LSB. None of that is well-defined.
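
To see what one particular compiler actually does with the layout, the object representation can be copied out byte by byte. The following is only a sketch, assuming the bits struct from the question; the helper name bits_layout_probe is made up for illustration, and whatever it reports holds for that compiler and target only.

```c
#include <stdbool.h>
#include <string.h>

typedef struct {
    bool _1 : 1, _2 : 1, _3 : 1, _4 : 1, _5 : 1, _6 : 1, _7 : 1, _8 : 1;
} bits;

/* Hypothetical helper: builds a `bits` object with only _3 set, copies
   up to `cap` bytes of its object representation into `out`, and
   returns sizeof(bits). Inspecting an object through unsigned char
   (here via memcpy) is always permitted. */
size_t bits_layout_probe(unsigned char *out, size_t cap) {
    bits b;
    memset(&b, 0, sizeof b);   /* start from all-zero bytes */
    b._3 = 1;                  /* which byte and bit this touches is
                                  implementation-defined */
    size_t n = sizeof b < cap ? sizeof b : cap;
    memcpy(out, &b, n);
    return sizeof b;
}
```

With the GCC build from the question (where the assignment compiled to or eax, 4) one would expect a one-byte struct whose byte reads 0x04, but another compiler is free to report a different size or a different bit.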

On top of that, bits *b = (bits *)&c; accesses a char through a struct pointer, which is a strict aliasing violation and may also cause alignment problems. It is undefined behavior in both C and C++. You would need to at least put this struct into a union with a char to dodge strict aliasing, but you may still get alignment hiccups (and C++ frowns at type punning through unions).
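
A rough sketch of the union variant, for C only, could look like the following. The names byte_view and set_third_named_bit are invented for this example, and the compile-time check encodes the assumption that the struct occupies exactly one byte. This removes the aliasing problem, not the implementation-defined bit order.

```c
#include <stdbool.h>

typedef struct {
    bool _1 : 1, _2 : 1, _3 : 1, _4 : 1, _5 : 1, _6 : 1, _7 : 1, _8 : 1;
} bits;

/* Hypothetical union: both members share the same storage. */
typedef union {
    unsigned char raw;
    bits          named;
} byte_view;

/* Assumption: the eight bit-fields fit in one byte. If a compiler lays
   them out differently, refuse to build rather than misbehave. */
_Static_assert(sizeof(bits) == sizeof(unsigned char),
               "bits is expected to occupy exactly one byte");

/* Stores `value` through the char member, sets _3 through the struct
   member, and reads the byte back through the char member. Reading a
   union member other than the last one written is allowed in C, not in
   C++. Which bit of the result changes is implementation-defined. */
unsigned char set_third_named_bit(unsigned char value) {
    byte_view v = { .raw = value };
    v.named._3 = 1;
    return v.raw;
}
```

Calling set_third_named_bit(0) returns a byte with one bit set; which one depends on the compiler, so code that relies on a particular mask is still non-portable.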

(And going from boolean type to character type can give some real crazy results too, see _Bool type and strict aliasing)


None of this is convenient at all - bitfields are very poorly defined. It is much better to simply do:

c |= 1u << n;     // set bit n
c &= ~(1u << n);  // clear bit n

This is portable, type-generic and endianness-independent.

(Though to dodge change of signedness due to implicit integer promotions, it is good practice to always cast the result of ~ back to the intended type: c &= (uint8_t) ~(1u << n);).

Note that the type char is entirely unsuitable for bitwise arithmetic since it may or may not be signed. Instead you should use unsigned char or preferably uint8_t.
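
If the goal is simply to name the bits, named masks on a uint8_t give the same convenience without bit-fields. This is a minimal sketch: the BIT_1 to BIT_8 constants and the bit_set, bit_clear and bit_test helpers are hypothetical names chosen here, with BIT_1 taken to be the least significant bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the eight bits; BIT_1 is the least
   significant bit, BIT_8 the most significant. */
enum {
    BIT_1 = 1u << 0, BIT_2 = 1u << 1, BIT_3 = 1u << 2, BIT_4 = 1u << 3,
    BIT_5 = 1u << 4, BIT_6 = 1u << 5, BIT_7 = 1u << 6, BIT_8 = 1u << 7
};

/* Returns `value` with the bits in `mask` set. */
uint8_t bit_set(uint8_t value, uint8_t mask) {
    return (uint8_t)(value | mask);
}

/* Returns `value` with the bits in `mask` cleared. The result of ~ is
   cast back to uint8_t because integer promotion turns it into a
   (negative) int. */
uint8_t bit_clear(uint8_t value, uint8_t mask) {
    return (uint8_t)(value & (uint8_t)~mask);
}

/* Returns true if any bit in `mask` is set in `value`. */
bool bit_test(uint8_t value, uint8_t mask) {
    return (value & mask) != 0;
}
```

The question's b->_3 = 1; then becomes c = bit_set(c, BIT_3); with c declared as uint8_t, and the check c & 0x4 becomes bit_test(c, BIT_3). The bit position is fixed by the mask value, not by the compiler.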



Source: https://stackoverflow.com/questions/55531589/accessing-the-bits-in-char-through-a-bitfield
