Let's consider the structs:

    struct S1 {
        int a;
        char b;
    };

    struct S2 {
        struct S1 s; /* struct needed to make this compile as C without typedef */
        char c;
    };

    // For the C++ fans
    struct S3 : S1 {
        char c;
    };
The size of S1 is 8, which is expected due to alignment. But the size of S2 and S3 is 12, which means the compiler lays them out as:

    | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11|
    |       a       | b |  padding  | c |  padding  |

The compiler could place c in the padding at bytes 5, 6, or 7 without breaking alignment constraints. What rule prevents this, and what is the reason behind it?
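The layout described in the question can be checked directly. The following is a minimal sketch, not part of the original post: `print_layout` and `offset_of_s3_c` are hypothetical helper names, and the printed numbers depend on the platform and ABI.

```cpp
#include <cassert>   // assert
#include <cstddef>   // offsetof, std::size_t
#include <cstdio>    // std::printf

struct S1 { int a; char b; };
struct S2 { S1 s; char c; };
struct S3 : S1 { char c; };

// Hypothetical helper: byte offset of S3::c from the start of the object.
// offsetof is only conditionally supported for non-standard-layout types
// such as S3, so the offset is computed from addresses instead.
std::size_t offset_of_s3_c() {
    S3 obj{};
    const char* base = reinterpret_cast<const char*>(&obj);
    const char* member = reinterpret_cast<const char*>(&obj.c);
    return static_cast<std::size_t>(member - base);
}

// Hypothetical helper: prints the sizes and offsets discussed in the question.
void print_layout() {
    // A member subobject occupies its full sizeof, so c must come after it.
    assert(offsetof(S2, c) >= sizeof(S1));
    std::printf("sizeof: S1=%zu S2=%zu S3=%zu\n",
                sizeof(S1), sizeof(S2), sizeof(S3));
    std::printf("offset of c: in S2=%zu, in S3=%zu\n",
                offsetof(S2, c), offset_of_s3_c());
}
```

Calling `print_layout()` from `main` on a typical x86-64 build with GCC or Clang should report the sizes quoted in the question (8, 12, 12), but other targets may differ.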
Short answer (for the C++ part of the question): The Itanium ABI for C++ prohibits, for historical reasons, using the tail padding of a base subobject of POD type. Note that C++11 does not have such a prohibition. The relevant rule 3.9/2 that allows trivially-copyable types to be copied via their underlying representation explicitly excludes base subobjects.
Long answer: I will try and treat C++11 and C at once.
1. The layout of S1 must include padding, since S1::a must be aligned for int, and an array S1[N] consists of contiguously allocated objects of type S1, each of whose a member must be so aligned.
2. In C++, objects of a trivially-copyable type T that are not base subobjects can be treated as arrays of sizeof(T) bytes (i.e. you can cast an object pointer to an unsigned char * and treat the result as a pointer to the first element of an unsigned char[sizeof(T)], and the value of this array determines the object). Since all objects in C are of this kind, this explains S2 for C and C++.
3. The interesting cases remaining for C++ are:
   1. base subobjects, which are not subject to the above rule (cf. C++11 3.9/2), and
   2. any object that is not of trivially-copyable type.
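The byte-representation rule from the list above can be illustrated with a short sketch. The helper name `round_trip_through_bytes` is made up for this example; the point is only that a complete object of trivially-copyable type can be copied out to, and back from, an array of `sizeof(T)` bytes, which is why all `sizeof(S1)` bytes of a member of type S1 must belong to that member.

```cpp
#include <cassert>      // assert
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_trivially_copyable

struct S1 { int a; char b; };

static_assert(std::is_trivially_copyable<S1>::value,
              "byte-wise copying is only defined for trivially-copyable types");

// Hypothetical helper: copies an S1 into a raw byte buffer and back.
S1 round_trip_through_bytes(const S1& src) {
    unsigned char bytes[sizeof(S1)];       // the underlying representation
    std::memcpy(bytes, &src, sizeof(S1));  // object -> bytes (padding included)
    S1 dst;
    std::memcpy(&dst, bytes, sizeof(S1));  // bytes -> object
    assert(dst.a == src.a && dst.b == src.b);
    return dst;
}
```

Note that both copies write or read the padding bytes too. If another member lived in that padding, the second `memcpy` would clobber it.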
For 3.1, there are indeed common, popular "base layout optimizations" in which compilers "compress" the data members of a class into the base subobjects. This is most striking when the base class is empty (∞% size reduction!), but applies more generally. However, the Itanium ABI for C++, which I linked above and which many compilers implement, forbids such tail padding compression when the respective base type is POD (and POD means trivial and standard-layout).
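As a rough illustration of the empty-base case mentioned above, the sketch below compares an empty class used as a member with the same class used as a base. The type names `Empty`, `AsMember` and `AsBase` are invented for this example, and the exact sizes are platform-dependent.

```cpp
#include <cassert>  // assert
#include <cstdio>   // std::printf

struct Empty {};

struct AsMember { Empty e; int x; };  // the empty member still needs its own address
struct AsBase : Empty { int x; };     // the empty base can share storage with x

// Hypothetical helper: prints the three sizes for comparison.
void print_empty_base_sizes() {
    // Every complete object has a non-zero size, even for an empty class.
    assert(sizeof(Empty) >= 1);
    std::printf("sizeof: Empty=%zu AsMember=%zu AsBase=%zu\n",
                sizeof(Empty), sizeof(AsMember), sizeof(AsBase));
}
```

On common implementations `AsBase` is expected to be no larger than a bare `int`, while `AsMember` pays for the empty member plus alignment padding.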
For 3.2 the same part of the Itanium ABI applies, though I don't currently believe that the C++11 standard actually mandates that arbitrary, non-trivially-copyable member objects must have the same size as a complete object of the same type.
Previous answer kept for reference.
I believe this is because S1 is standard-layout, and so for some reason the S1-subobject of S3 remains untouched. I'm not sure if that's mandated by the standard.
However, if we turn S1 into non-standard layout, we observe a layout optimization:

    struct EB { };

    struct S1 : EB {   // not standard-layout
        EB eb;
        int a;
        char b;
    };

    struct S3 : S1 {
        char c;
    };

Now sizeof(S1) == sizeof(S3) == 12 on my platform. Live demo.
And here is a simpler example:
    struct S1 {
    private:
        int a;
    public:
        char b;
    };

    struct S3 : S1 {
        char c;
    };

The mixed access makes S1 non-standard-layout. (Now sizeof(S1) == sizeof(S3) == 8.)
Update: The defining factor seems to be triviality as well as standard-layoutness, i.e. the class must be POD. The following non-POD standard-layout class is base-layout optimizable:
    struct S1 {
        ~S1(){}
        int a;
        char b;
    };

    struct S3 : S1 {
        char c;
    };

Again sizeof(S1) == sizeof(S3) == 8. Demo
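The classification used in these examples can be checked with the standard type traits. This is only a sketch: the struct names (`Plain`, `MixedAccess`, `WithDtor` and the derived types) and the helpers `is_pod_like` and `tail_padding_reused` are hypothetical, and the size comparison reflects what the ABI in use happens to do, not something the standard guarantees.

```cpp
#include <cassert>      // assert
#include <type_traits>  // std::is_trivial, std::is_standard_layout

struct Plain       { int a; char b; };                     // POD
struct MixedAccess { private: int a; public: char b; };    // not standard-layout
struct WithDtor    { ~WithDtor() {} int a; char b; };      // not trivial

struct DerivedPlain : Plain       { char c; };
struct DerivedMixed : MixedAccess { char c; };
struct DerivedDtor  : WithDtor    { char c; };

// Hypothetical helper: POD in the C++11 sense (trivial and standard-layout).
template <class T>
constexpr bool is_pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

// Hypothetical helper: true if Derived fits its extra member into Base's footprint.
template <class Base, class Derived>
constexpr bool tail_padding_reused() {
    return sizeof(Derived) == sizeof(Base);
}

static_assert(is_pod_like<Plain>(), "Plain is POD");
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access control");
static_assert(std::is_standard_layout<WithDtor>::value &&
              !std::is_trivial<WithDtor>::value, "user-provided destructor");

// Assumption: holds on the ABIs discussed here, not required by the standard.
void check_plain_base_keeps_padding() {
    assert((!tail_padding_reused<Plain, DerivedPlain>()));
}
```

Under the Itanium ABI one would expect `tail_padding_reused` to be false for the `Plain` pair and true for the other two pairs, in line with the sizes reported above.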
Let's consider some code:
    struct S1 {
        int a;
        char b;
    };

    struct S2 {
        S1 s;
        char c;
    };

Let's consider what would happen if sizeof(S1) == 8 and sizeof(S2) == 8.

    struct S2 s2;
    struct S1 *s1 = &(s2.s);
    memset(s1, 0, sizeof(*s1));

You've now overwritten S2::c.

For array alignment reasons, S2 also cannot have a size of 9, 10, or 11. So the next valid size is 12.
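The argument above can be turned into a small check. This is a sketch with a made-up helper name (`c_after_clearing_s`); it shows that with the layout compilers actually use, zeroing the S1 member leaves c intact, and that the size of S2 is a multiple of its alignment as arrays require.

```cpp
#include <cassert>  // assert
#include <cstring>  // std::memset

struct S1 { int a; char b; };
struct S2 { S1 s; char c; };

// Arrays of S2 need every element aligned, so the size is a multiple of the alignment.
static_assert(sizeof(S2) % alignof(S2) == 0, "size must be a multiple of alignment");
// c lives outside the sizeof(S1) bytes owned by s.
static_assert(sizeof(S2) >= sizeof(S1) + sizeof(char), "c is not inside s");

// Hypothetical helper: clears the S1 member byte-wise and returns S2::c afterwards.
char c_after_clearing_s() {
    S2 s2 = { {1, 'x'}, 'y' };
    S1* s1 = &s2.s;
    std::memset(s1, 0, sizeof(*s1));  // touches all sizeof(S1) bytes, padding included
    assert(s2.s.a == 0 && s2.s.b == 0);
    return s2.c;                      // still 'y' because c is not in s's padding
}
```

If c had been placed in the tail padding of s, the `memset` would have zeroed it as well.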
Here are a couple of examples of why a compiler can't place member c in the trailing padding of the struct S1 member s. Assume for the following that the compiler did place struct S2.c in the padding of the struct S1.s member:

    struct S1 {
        int a;
        char b;
    };

    struct S2 {
        struct S1 s; /* struct needed to make this compile as C without typedef */
        char c;
    };

    // ...

    struct S1 foo = { 10, 'a' };
    struct S2 bar = {{ 20, 'b'},