Question
Consider this union:
union A{
int a;
struct{
int b;
} c;
};
c and a are not layout-compatible types, so it is not possible to read the value of b through a:
A x;
x.c.b=10;
x.a+x.a; //undefined behaviour (UB)
Trial 1
For the case below, I think that since C++17 I also get undefined behavior:
A x;
x.a=10;
auto p = &x.a; //(1)
x.c.b=12; //(2)
*p+*p; //(3) UB
Let's consider [basic.compound]/3:
Every value of pointer type is one of the following:
- a pointer to an object or function (the pointer is said to point to the object or function), or
- a pointer past the end of an object ([expr.add]), or
- the null pointer value ([conv.ptr]) for that type, or
- an invalid pointer value.
Let's call these four categories of pointer values the pointer value genre.
The value of a pointer may transition from one of the above-mentioned genres to another, but the standard is not really explicit about that. Feel free to correct me if I am wrong. So I suppose that at (1) the value of p is a "pointer to" value. Then at (2) the lifetime of a ends and the value of p becomes an invalid pointer value. So at (3) I get UB because I try to access the value of an object (a) outside of its lifetime.
Trial 2
Now consider this weird code:
A x;
x.a=10;
auto p = &x.a; //(1)
x.c.b=12; //(2)
p = reinterpret_cast<int*>(p); //(2')
*p+*p; //(3) UB?
Could the reinterpret_cast<int*>(p) change the pointer value genre from "invalid pointer value" to a "pointer to" value?
reinterpret_cast<int*>(p) is defined to be equivalent to static_cast<int*>(static_cast<void*>(p)), so let's consider how the static_cast from void* to int* is defined, [expr.static.cast]/13:
A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.
So in our case the original pointer pointed to the object a. So I suppose the reinterpret_cast will not help because a is not within its lifetime. Is my reading too strict? Could this code be well defined?
Answer 1:
Then in (2) a life ends and the value of p becomes an invalid pointer value.
Incorrect. Pointers only become invalid when they point into storage whose storage duration has ended.
The pointer in this case becomes a pointer to an object outside of its lifetime. The object it points to is gone, but the pointer is not "invalid" in the way the specification means it. [basic.life] spends quite a bit of time explaining what you can and cannot do to pointers to objects outside of their lifetime.
reinterpret_cast cannot turn a pointer to an object outside of its lifetime into a pointer to a different object that is within its lifetime.
Answer 2:
The notion of objects in the standard is rather abstract and differs somewhat from intuition. An object may be within its lifetime or not, and objects not within their lifetimes can have the same address. This is why unions work at all: the definition of active member is "the member that is within its lifetime".
A pointer to an object not within its lifetime is still a pointer to an object. reinterpret_cast only changes the type of the pointer, not its validity. The UB you get when casting to non-pointer-interconvertible types is due to the strict-aliasing rule, not to the validity of the pointer.
In all your trials, including your follow-up question, you are using an object not within its lifetime in ways that aren't allowed, i.e. accessing it, and the results are consequently UB.
Answer 3:
Every version to date of the C and C++ Standards has been ambiguous or contradictory with regard to what can be done with addresses of union members. The authors of the C Standard didn't want to require that compilers make pessimistic allowances for the possibility that functions might be invoked by constructs like:
someFunction(&myUnion.member1, &myUnion.member2);
in cases where the function would cause the value of one member of myUnion to be changed between accesses made via the other. While the ability to take union members' addresses would have been pretty useless if code couldn't do things like:
someFunction1(&myUnion.member1);
someFunction2(&myUnion.member2);
someFunction3(&myUnion.member1);
the authors of the Standard expected that quality implementations intended for various purposes would process constructs that invoke Undefined Behavior "in a documented fashion characteristic of the environment" when doing so would best serve those purposes, and thus thought that making support for such constructs a quality-of-implementation issue would be simpler than trying to formulate precise rules for which patterns must be supported. A compiler that generated code for the called functions in the second example without knowing their calling context wouldn't be able to interleave accesses performed by the two functions, and a quality compiler that expanded them inline while processing the above code would have no trouble noticing when each pointer was derived from myUnion.
The authors of the C89 Standard didn't think it necessary to define precise rules for how pointers to union members behave, because they thought compiler writers' desire to produce quality implementations would drive them to handle appropriate cases sensibly even without such rules. Unfortunately, some compiler writers were too lazy to handle cases like the second example above, and rather than recognizing that there was never any reason for quality compilers to be incapable of handling such cases, the authors of later C and C++ Standards have bent over backward to come up with weirdly contorted, ambiguous, and contradictory rules that justify such compiler behavior.
As a result, the address-of operator should only be regarded as meaningfully applicable to union members in cases where the resulting pointer will be used for accessing individual bytes of storage, either using character types directly or by passing it to functions like memcpy that are defined in such fashion. Unless or until there's a major revamp of the Standard, or an appendix that describes means by which implementations can offer optional guarantees beyond what the Standard requires, it would be best to pretend that union members are, like bitfields, lvalues that don't have addresses.
Source: https://stackoverflow.com/questions/56307775/could-reinterpret-cast-turn-an-invalid-pointer-value-into-a-valid-one