Should use of bit-fields of type int be discouraged? [closed]

Question

From the Draft C++ Standard (N3337):

9.6 Bit-fields

4 If the value true or false is stored into a bit-field of type bool of any size (including a one bit bit-field), the original bool value and the value of the bit-field shall compare equal. If the value of an enumerator is stored into a bit-field of the same enumeration type and the number of bits in the bit-field is large enough to hold all the values of that enumeration type (7.2), the original enumerator value and the value of the bit-field shall compare equal.

The standard is non-committal about any such behavior for bit-fields of other types. To understand how g++ (4.7.3) deals with other types of bit-fields, I used the following test program:

#include <iostream>

enum TestEnum
{
   V1 = 0,
   V2
};

struct Foo
{
   bool          d1:1;
   TestEnum      d2:1;
   int           d3:1;
   unsigned int  d4:1;
};

int main()
{
   Foo foo;
   foo.d1 = true;
   foo.d2 = V2;
   foo.d3 = 1;
   foo.d4 = 1;

   std::cout << std::boolalpha;

   std::cout << "d1: " << foo.d1 << std::endl;
   std::cout << "d2: " << foo.d2 << std::endl;
   std::cout << "d3: " << foo.d3 << std::endl;
   std::cout << "d4: " << foo.d4 << std::endl;
   std::cout << std::endl;

   std::cout << (foo.d1 == true) << std::endl;
   std::cout << (foo.d2 == V2) << std::endl;
   std::cout << (foo.d3 == 1) << std::endl;
   std::cout << (foo.d4 == 1) << std::endl;

   return 0;
}

The output:

d1: true
d2: 1
d3: -1
d4: 1

true
true
false
true

I was surprised by the lines of the output corresponding to Foo::d3. The output is the same at ideone.com.

Since the standard is non-committal about the comparison of bit-fields of type int, g++ does not seem to be in violation of the standard. That brings me to my questions.

Is use of bit-fields of type int a bad idea? Should it be discouraged?

Answer 1:

Yes, bit fields of type int are a bad idea, because their signedness is implementation-defined. Use signed int or unsigned int instead.

For non-bitfield declarations, the type name int is exactly equivalent to signed int (or int signed, or signed). The same pattern is followed for short, long, and long long: the unadorned type name is the signed version, and you have to add the unsigned keyword to name the corresponding unsigned type.

Bit fields, for historical reasons, are a special case. A bit-field defined with the type int is equivalent either to the same declaration with signed int, or to the same declaration with unsigned int. The choice is implementation-defined (i.e., it's up to the compiler, not to the programmer). A bit field is the only context in which int and signed int aren't (necessarily) synonymous. The same applies to char, short, long, and long long.

Quoting the C++11 standard, section 9.6 [class.bit]:

It is implementation-defined whether a plain (neither explicitly signed nor unsigned) char, short, int, long, or long long bit-field is signed or unsigned.

(I'm not entirely sure of the rationale for this. Very old versions of C didn't have the unsigned keyword, and unsigned bit fields are usually more useful than signed bit fields. It may be that some early C compilers implemented bit fields before the unsigned keyword was introduced. Making bit fields unsigned by default, even when declared as int, may have been just a matter of convenience. There's no real reason to keep the rule other than to avoid breaking old code.)

Most bit fields are intended to be unsigned, which of course means that they should be defined that way.

If you want a signed bit field (say, a 4-bit field that can represent values from -8 to +7, or from -7 to +7 on a non-two's-complement system), then you should explicitly define it as signed int. If you define it as int, then some compilers will treat it as unsigned int.

If you don't care whether your bit field is signed or unsigned, then you can define it as int -- but if you're defining a bit field, then you almost certainly do care whether it's signed or unsigned.

Answer 2:

You can absolutely use unsigned bit-fields of any size no greater than the size of an unsigned int. While signed bit-fields are legal (at least if the width is greater than one), I personally prefer not to use them. If, however, you do want to use a signed bit-field, you should explicitly mark it as signed because it is implementation-dependent as to whether an unqualified int bit-field is signed or unsigned. (This is similar to char, but without the complicating feature of explicitly unqualified char* literals.)

So to that extent, I agree that int bit-fields should be discouraged. [Note 1] While I don't know of any implementation in which an int bitfield is implicitly unsigned, it is certainly allowed by the standard, and consequently there is lots of opportunity for implementation-specific unanticipated behaviour if you are not explicit about signs.

The standards specify that a signed integer representation consists of optional padding bits, exactly one sign bit, and value bits. The standard does not explicitly guarantee that there is at least one value bit -- and as the example in the OP shows, gcc does not insist that there be -- but I think requiring one is a plausible interpretation of the standard, since it explicitly allows there to be no padding bits and has no such wording for value bits.

In any case, there are only three possible signed representations allowed:

2's complement, in which a single-bit field consisting of a 1 should be interpreted as -1
1's complement and sign-magnitude. In both of these cases, a single-bit field consisting of a 1 is allowed to be a trap representation, so the only number which can be represented in a 1-bit signed bit-field is 0.

Since portable code cannot assume that a 1-bit signed bit-field can represent any non-zero value, it seems reasonable to insist that a signed bit-field have at least 2 bits, regardless of whether you interpret the standard(s) to actually require that or not.

Notes:

1. Indeed, if it were not for the fact that string literals are explicitly unqualified, I would prefer to always specify unsigned char. But there's no way to roll back history on that point.

Answer 3:

int is signed, and a C++ implementation may use two's complement, so the first bit of the bit-field may be used to store the sign. When there are 2 bits for a signed int bit-field, it can be equal to 1.

Answer 4:

This is perfectly logical. int is a signed integer type, and if the underlying architecture uses two's complement to represent signed integers (as all modern architectures do), then the high-order bit is the sign bit. So a 1-bit signed integer bitfield can take the values 0 or -1. And a 3-bit signed integer bitfield, for instance, can take values between -4 and 3 inclusive.

There is no reason for a blanket ban on signed integer bitfields, as long as you understand two's complement representation.

Source: https://stackoverflow.com/questions/25940635/should-use-of-bit-fields-of-type-int-be-discouraged

Tags

c++

bit-fields