Avoiding object slicing

Question

So I am brushing up on C++, and honestly it's been a while. I made a console Pong game as a sort of refresher task and got some input on using polymorphism, with my classes deriving from a base "GameObject" (which has some base methods for drawing objects to the screen).

One piece of that input, which I subsequently asked about, was how memory works when deriving from base classes, since I hadn't really done much advanced C++.

For instance, let's say we have a base class. For now it just has a "Draw" method (btw, why do we need to say virtual for it?), since all the derived objects really share only one common method, and that's being drawn:

class GameObject
{
public:

    virtual void Draw( ) = 0;
};

we also have a ball class for instance:

class Ball : public GameObject

The input I received is that in a proper game these would probably be kept in some sort of vector of GameObject pointers. Something like this: std::vector<GameObject*> _gameObjects;

(So a vector of pointers to GameObjects) (BTW Why would we use pointers here? why not just pure GameObjects?). We would instantiate one of these gameObjects with something like:

_gameObjects.push_back( new Ball( -1, 1, boardWidth / 2, boardHeight / 2 ) );

(new returns a pointer to the object correct? IIRC). From my understanding if I tried to do something like:

Ball b;
GameObject g = b;

then things would get messed up (as seen here: What is object slicing?)

However...am I not simply creating Derived objects on their own when I do the new Ball( -1, 1, boardWidth / 2, boardHeight / 2 ); or is that automatically assigning it as a GameObject too? I can't really figure out why one works and one doesn't. Does it have to do with creating an object via new vs just Ball ball for example?

Sorry if the question makes no sense, I'm just trying to understand how this object slicing would happen.

Answer 1:

The fundamental issue is copying an object (which is not an issue in languages where classes are "reference types", but in C++ the default is to pass things by value, i.e. making a copy). "Slicing" means copying the value of a bigger object (of type B, which derives from A) into a smaller object (of type A). Because A is smaller, only a partial copy is made.

When you create a container, its elements are full objects of their own. For example:

std::vector<int> v(3);  // define a vector of 3 integers
int i = 42;
v[0] = i;  // copy 42 into v[0]

v[0] is an int variable, just like i.

The same thing happens with classes:

class Base { ... };
std::vector<Base> v(3);  // instantiates 3 Base objects
Base x(42);
v[0] = x;

The last line copies the contents of the x object into the v[0] object.

If we change the type of x like this:

class Derived : public Base { ... };
std::vector<Base> v(3);
Derived x(42, "hello");
v[0] = x;

... then v[0] = x tries to copy the contents of a Derived object into a Base object. What happens in this case is that all members declared in Derived are ignored. Only the data members declared in the base class Base are copied, because that's all v[0] has room for.

What a pointer gives you is the ability to avoid copying. When you do

T x;
T *ptr = &x;

, ptr is not a copy of x, it just points to x.

Similarly, you can do

Derived obj;
Base *ptr = &obj;

&obj and ptr have different types (Derived * and Base *, respectively), but C++ allows this code anyway. Because Derived objects contain all members of Base, it's OK to let a Base pointer point at a Derived instance.

What this gives you is essentially a reduced interface to obj. When accessed through ptr, it only has the methods declared in Base. But because no copying was done, all data (including the Derived specific parts) are still there and can be used internally.

As for virtual: Normally, when you call a method foo through an object of type Base, it will invoke exactly Base::foo (i.e. the method defined in Base). This happens even if the call is made through a pointer that actually points at a derived object (as described above) with a different implementation of the method:

class Base {
    public:
    void foo() const { std::cout << "hello from Base::foo\n"; }
};

class Derived : public Base {
    public:
    void foo() const { std::cout << "hello from Derived::foo\n"; }
};

Derived obj;
Base *ptr = &obj;
obj.foo();  // calls Derived::foo
ptr->foo();  // calls Base::foo, even though ptr actually points to a Derived object

By marking foo as virtual, we force the method call to use the actual type of the object, instead of the declared type of the pointer the call is made through:

class Base {
    public:
    virtual void foo() const { std::cout << "hello from Base::foo\n"; }
};

class Derived : public Base {
    public:
    void foo() const { std::cout << "hello from Derived::foo\n"; }
};

Derived obj;
Base *ptr = &obj;
obj.foo();  // calls Derived::foo
ptr->foo();  // also calls Derived::foo

virtual has no effect on normal objects because there the declared type and the actual type are always the same. It only affects method calls made through pointers (and references) to objects, because those have the ability to refer to other objects (of potentially different types).

And that is another reason to store a collection of pointers: When you have several different subclasses of GameObject, all of which implement their own custom Draw method, you want the code to pay attention to the actual types of the objects, so the right method gets called in each case. If Draw weren't virtual, your code would attempt to invoke GameObject::Draw, which has no definition. Depending on how exactly you code it, this would fail either to compile or to link.

Answer 2:

Object slicing happens when you directly store objects in a container. No slicing can occur when you store pointers (or, better, smart pointers) to objects. So if you store a Ball in a vector<GameObject> it will be sliced, but if you store a Ball * in a vector<GameObject *>, all will be fine.

Answer 3:

The quick answer to your question is that object slicing is not an issue when you do _gameObjects.push_back( new Ball( ... )) because new allocates enough memory for a Ball-sized object.

Here's the explanation. Object slicing is an issue where the compiler believes an object to be smaller than it actually is. So in your code example:

Ball b;
GameObject g = b;

The compiler has reserved enough space for a GameObject named g, and yet you are trying to put a Ball (b) there. But a Ball may be bigger than a GameObject, so only the GameObject part of b is copied and the Ball-specific data is lost.

However, when you do new Ball(...) or new GameObject(...), the compiler knows exactly how much space to allocate because it knows the true type of the object. Then, what you store is actually a Ball* or GameObject*. And you can safely store a Ball* in a GameObject* type because the pointers are the same size, so object slicing does not occur. The memory pointed at can be any number of different sizes, but the pointers will always be the same size.

Answer 4:

Btw why do we need to say virtual for it?

If you don't declare a function to be virtual, then you cannot call the function with virtual dispatch. When a function is called virtually through a pointer or reference to a base class, then the call is dispatched to an override in the most derived class (if any exist). In other words, virtual allows runtime polymorphism.

If the function is non-virtual, then the function can only be dispatched statically. When a function is called statically, the function of the compile time type is called. So, if a function is called statically through a base pointer, then the base function is called, not a derived override.

BTW Why would we use pointers here? why not just pure GameObjects?

GameObject is an abstract class, so you cannot have concrete objects of that type. Since you cannot have a concrete GameObject, you cannot have an array (nor vector) of them either. GameObject instances can only exist as a base class sub object of a derived type.

new returns a pointer to the object correct?

new creates an object in dynamic storage, and returns a pointer to that object.

By the way, if you fail to call delete on the pointer before losing the pointer value, you have a memory leak. Oh, and if you attempt to delete something twice, or delete something that didn't originate from new, the behaviour of your program will be undefined. Memory allocation is difficult, and you should always use smart pointers to manage it. A vector of bare owning pointers such as in your example is a very bad idea.

Furthermore, deleting an object through a pointer to its base class has undefined behaviour unless the destructor of the base class is virtual. The destructor of GameObject is not virtual, so there is no way for your program to avoid either UB or a memory leak. Both options are bad. The solution is to make the destructor of GameObject virtual.

Avoiding object slicing

You can avoid accidental object slicing by making the base class abstract. Since there cannot be concrete instances of an abstract class, you cannot accidentally "slice off" the base of a derived object.

For example:

Ball b;
GameObject g = b;

is ill-formed because GameObject is an abstract class. The compiler might say something like this:

main.cpp: In function 'int main()':
main.cpp:16:20: error: cannot allocate an object of abstract type 'GameObject'
 GameObject g = b;
                ^
main.cpp:3:7: note:   because the following virtual functions are pure within 'GameObject':
 class GameObject
       ^~~~~~~~~~
main.cpp:7:18: note:    'virtual void GameObject::Draw()'
     virtual void Draw( ) = 0;
                  ^~~~
main.cpp:16:16: error: cannot declare variable 'g' to be of abstract type 'GameObject'
     GameObject g = b;

Answer 5:

I will attempt to answer the various questions you have asked, although others may have a more technical explanation in their answers.

virtual void Draw( ) = 0;

Why do we need to say virtual for it?

In simple terms, the virtual keyword tells the C++ compiler that the function can be overridden in a child class. When you call Draw() through a GameObject pointer or reference that actually refers to a Ball, Ball::Draw() is executed if it exists, instead of GameObject::Draw().

std::vector<GameObject*> _gameObjects;

Why would we use pointers here?

This is a good idea because object slicing does happen when the container has to allocate space for and contain the objects themselves. Remember that a pointer is a constant size, regardless of what it points to. When you have to resize the container or move elements around, the pointers are much easier and faster to move. And you can always cast a pointer to GameObject back into a pointer to Ball if you are certain that is a valid thing to do.

new returns a pointer to the object correct?

Yes, what new is doing is constructing an instance of that class on the heap and then returning a pointer to that instance.
I strongly recommend that you learn how to use smart pointers though. These can automatically delete objects when they are no longer referenced. Kind of like what a garbage collector does in a language like Java or C#.

new Ball( -1, 1, boardWidth / 2, boardHeight / 2 );

...or is that automatically assigning it as a GameObject too?

Yes, if Ball inherits the GameObject class, then a pointer to a Ball will also be a valid pointer to a GameObject. As you'd expect, you can't access the members of Ball from a pointer to a GameObject though.

Does it have to do with creating an object via new vs just Ball ball for example?

I will explain the difference between the two ways to instantiate a Ball:

Ball ballA = Ball();
Ball* ballB = new Ball();

For ballA we are declaring that the ballA variable is an instance of Ball that will "live" in the stack memory. We use the Ball() constructor to initialize the ballA variable to an instance of a Ball. Since this is a stack variable the ballA instance will be destroyed once the program exits the scope in which it is declared.

For ballB we are declaring that the ballB variable is a pointer to an instance of Ball that will live in the heap memory. We use the new Ball() statement to first allocate heap memory for a Ball and then construct it with the Ball() constructor. Finally that new statement evaluates to a pointer which is assigned to ballB. Now when the program exits the scope where ballB is declared, the pointer is destroyed but the instance it pointed to is left on the heap. If you didn't save the value of that pointer somewhere else you will be unable to free the memory used by that Ball instance. This is why smart pointers are useful because they internally keep track of whether the instance is still referenced anywhere.

Answer 6:

This has to do with values.

Ball b;
GameObject g;

The value of b is the different values of its variables.

The value of g is likewise the different values of its variables.

When b is assigned to g, the variables of a "subobject" of b (inherited from GameObject ) are assigned to variables of g. This is slicing.

Now about the functions.

To the compiler a member function of a class is a pointer to a memory where the code of the function resides.

A non-virtual function is always a constant pointer value.

But for a virtual function the pointer can have a different value depending on which class overrides the function.

So to tell the compiler that it should create a placeholder for the function pointer, the keyword virtual is used.

Now back to the assignment of values.

We know assigning different types of variables to each other can cause slicing. So to solve this issue indirection is used - a pointer to the object.

A pointer always needs the same amount of storage, whatever type it points to. And when a pointer is assigned, the underlying object is left unchanged; only the pointer is copied, overwriting the previous pointer value.

When we call a virtual function on a g that has been sliced, the call runs GameObject's version of the function, because g really is just a GameObject: the fields and overrides that b added were not copied.

But when calling through a pointer to the object, the original object b is used, which has all the fields required by b's virtual function.

Source: https://stackoverflow.com/questions/56464702/avoiding-object-slicing

Tags

c++

object-slicing