Mismatch of 'this' address when base class is not polymorphic but derived is

﹥>﹥吖頭↗ 提交于 2019-11-30 09:06:04

When you have a polymorphic single-inheritance hierarchy of classes, the typical convention followed by most (if not all) compilers is that each object in that hierarchy has to begin with a VMT pointer (a pointer to Virtual Method Table). In such case the VMT pointer is introduced into the object memory layout early: by the root class of the polymorphic hierarchy, while all lower classes simply inherit it and set it to point to their proper VMT. In such case all nested subobjects within any derived object have the same this value. That way by reading a memory location at *this the compiler has immediate access to VMT pointer regardless of the actual subobject type. This is exactly what happens in your last experiment. When you make the root class polymorphic, all this values match.

However, when the base class in the hierarchy is not polymorphic, it does not introduce a VMT pointer. The VMT pointer will be introduced by the very first polymorphic class somewhere lower in the hierarchy. In such case a popular implementational approach is to insert the VMT pointer before the data introduced by the non-polymorphic (upper) part of the hierarchy. This is what you see in your second experiment. The memory layout for Derived looks as follows

+------------------------------------+ <---- `this` value for `Derived` and below
| VMT pointer introduced by Derived  |
+------------------------------------+ <---- `this` value for `Base` and above
| Base data                          |
+------------------------------------+
| Derived data                       |
+------------------------------------+

Meanwhile, all classes in the non-polymorphic (upper) part of the hierarchy should know nothing about any VMT pointers. Objects of Base type must begin with data field Base::x. At the same time all classes in the polymorphic (lower) part of the hierarchy must begin with VMT pointer. In order to satisfy both of these requirements, the compiler is forced to adjust the object pointer value as it is converted up and down the hierarchy from one nested base subobject to another. That immediately means that pointer conversion across the polymorphic/non-polymorphic boundary is no longer conceptual: the compiler has to add or subtract some offset.

The subobjects from non-polymorphic part of the hierarchy will share their this value, while subobjects from the polymorphic part of hierarchy will share their own, different this value.

Having to add or subtract some offset when converting pointer values along the hierarchy is not unusual: the compiler has to do it all the time when dealing with multiple-inheritance hierarchies. However, you example shows how it can be achieved in single-inheritance hierarchy as well.

The addition/subtraction effect will also be revealed in a pointer conversion

Derived *pd = new Derived;
Base *pb = pd; 
// Numerical values of `pb` and `pd` are different if `Base` is non-polymorphic
// and `Derived` is polymorphic

Derived *pd2 = static_cast<Derived *>(pb);
// Numerical values of `pd` and `pd2` are the same

This looks like behavior of a typical implementation of polymorphism with a v-table pointer in the object. The Base class doesn't require such a pointer since it doesn't have any virtual methods. Which saves 4 bytes in the object size on a 32-bit machine. A typical layout is:

+------+------+------+
|   x  |   y  |   z  |
+------+------+------+

    ^
    | this

The Derived class however does require the v-table pointer. Typically stored at offset 0 in the object layout.

+------+------+------+------+
| vptr |   x  |   y  |   z  |
+------+------+------+------+

    ^
    | this

So to make the Base class methods see the same layout of the object, the code generator adds 4 to the this pointer before calling a method of the Base class. The constructor sees:

+------+------+------+------+
| vptr |   x  |   y  |   z  |
+------+------+------+------+
           ^
           | this

Which explains why you see 4 added to the this pointer value in the Base constructor.

Emilio Garavaglia

Technically speaking, this is exactly what happens.

However it must be noted that according to the language specification, the implementation of polymorphism does not necessarily relate to vtables: this is what the spec. defines as "implementation detail", that is out of the specs scope.

All what we can say is that this has a type, and points to what is accessible through its type. How the dereferencing into members happens, again, is an implementation detail.

The fact that a pointer to something when converted into a pointer to something else, either by implicit, static or dynamic conversion, has to be changed to accommodate what is around must be considered the rule, not the exception.

By the way C++ is defined, the question is meaningless, as are the answers, since they assume implicitly that the implementation is based on the supposed layouts.

The fact that, under given circumstances, two object sub-components share a same origin, is just a (very common) particular case.

The exception is "reinterpreting": when you "blind" the type system, and just say "look this bunch of bytes as they are an instance of this type": that's the only case you have to expect no address change (and no responsibility from the compiler about the meaningfulness of such a conversion).

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!