Rationale behind active members of unions

Question

C++'s unions are more restrictive than those of C, because they introduce the concept of an "active member" (the one last assigned to) as the only one safe to access. The way I see it, this behavior of unions is a net negative. Can someone please explain what is gained by having this restriction?

Answer 1:

Short answer

In C, the union is only a question of how to interpret the data that is stored at a given location. The data is passive.

In C++, unions can have members of class types. Class objects have not only data but also behavior. Since you rely on this (accessible) behavior (and may not even be able to access the private and protected members), the object must remain consistent from its construction to its destruction. The notion of an active member exists exactly for this purpose: to keep the object's lifecycle consistent.

Longer explanation

Imagine the following union:

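#include <iostream>
#include <string>
using namespace std;

union U {
    string s;
    int i;

    // because string has a non-trivial constructor and destructor,
    // U must define its own constructors and destructor
    U(string s) : s(s) { cout << "s is active" << endl; }
    U(int i) : i(i) { cout << "i is active" << endl; }
    U() : s() { cout << "s is active by default" << endl; }
    ~U() { cout << "delete... but what?" << endl; }  // U cannot tell which member is active
};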

Now suppose that I initialize it:

U u("hello");

The active member is s at that moment. I can now use this active member without risk:

u.s += ", world";  
cout << u.s <<endl;

Before changing the active member, I have to make sure that the lifetime of the current one has ended (a requirement of the C++ standard). If I forget this and, for example, use another member:

u.i=0;  // ouch!!! i is not the active member: what happens to the string?

I then have undefined behavior (in practice here, s is now corrupted and it is no longer possible to recover the memory in which the characters were stored). You could also imagine the opposite. Suppose the active member is i, and I now want to use the string:

u.s="goodbye";  // assumes that s is an active, valid string, which is not the case

Here, the compiler assumes that I know that s is the active member. But as s is not a properly initialized string, invoking its copy assignment operator will also result in undefined behavior.

Demo of what you should not do

How to do it right?

The standard explains it:

If M has a non-trivial destructor and N has a non-trivial constructor (for instance, if they declare or inherit virtual functions), the active member of u can be safely switched from m to n using the destructor and placement new-expression as follows:
u.m.~M();
new (&u.n) N;

So in our nasty example, the following would work:

u.s.~string(); // properly end the life of s
u.i=0;  // this is now the active member   
           // no need to end life of an int, as it has a trivial destructor 
new (&u.s) string("goodbye");  // placement new  
cout << u.s <<endl;

Demo of how to (almost) do it right
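
One reading of the "almost" is that ~U() still cannot tell which member is active, so a string left active at the end is never destroyed. A common way to close that gap is to store a tag next to the union and route every switch through it. The following is only a minimal sketch of that idea: StringOrInt, Tag, set(), str() and num() are hypothetical names that do not appear in the answer, and copying is simply disabled to keep it short.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical wrapper (not from the answer): a union plus a tag that
// records which member is currently active.
class StringOrInt {
    enum class Tag { Int, Str };

    union Data {
        int i;
        std::string s;
        Data() : i(0) {}  // i is the active member after construction
        ~Data() {}        // the wrapper decides what to destroy
    };

    Data data_;
    Tag tag_;

    // End the lifetime of the active member if it needs a destructor call.
    void destroy() {
        if (tag_ == Tag::Str) {
            data_.s.~basic_string();  // explicit destructor call on the string
        }
    }

public:
    explicit StringOrInt(int v) : tag_(Tag::Int) { data_.i = v; }

    explicit StringOrInt(std::string v) : tag_(Tag::Str) {
        // int is trivial, so the string can be constructed directly in place
        new (&data_.s) std::string(std::move(v));
    }

    ~StringOrInt() { destroy(); }

    // Copying would need tag-aware logic too; disabled in this sketch.
    StringOrInt(const StringOrInt&) = delete;
    StringOrInt& operator=(const StringOrInt&) = delete;

    void set(int v) {
        destroy();       // u.s.~string() in the answer's terms, if needed
        data_.i = v;     // i becomes the active member
        tag_ = Tag::Int;
    }

    void set(std::string v) {
        destroy();
        new (&data_.s) std::string(std::move(v));  // placement new
        tag_ = Tag::Str;
    }

    bool holds_string() const { return tag_ == Tag::Str; }

    const std::string& str() const {
        assert(tag_ == Tag::Str);  // only the active member may be read
        return data_.s;
    }

    int num() const {
        assert(tag_ == Tag::Int);
        return data_.i;
    }
};
```

With this wrapper, constructing from std::string("hello"), calling set(0) and then set(std::string("goodbye")) performs the same destructor and placement-new steps as above, and the string is destroyed when the object goes out of scope. Since C++17, std::variant offers this kind of bookkeeping in the standard library.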

Source: https://stackoverflow.com/questions/46933166/rationale-behind-active-members-of-unions

Tags

c++

unions