Learning C++: returning references AND getting around slicing

Posted by 醉酒当歌 on 2019-12-21 04:04:08

Question


I'm having a devil of a time understanding references. Consider the following code:

#include <iostream>
using namespace std;

class Animal
{
public:
    virtual void makeSound() {cout << "rawr" << endl;}
};

class Dog : public Animal
{
public:
    virtual void makeSound() {cout << "bark" << endl;}
};

Animal* pFunc()
{
    return new Dog();
}

Animal& rFunc()
{
    return *(new Dog());
}

Animal vFunc()
{
    return Dog();
}

int main()
{
    Animal* p = pFunc();
    p->makeSound();

    Animal& r1 = rFunc();
    r1.makeSound();

    Animal r2 = rFunc();
    r2.makeSound();

    Animal v = vFunc();
    v.makeSound();
}

And the results are: "bark bark rawr rawr".

In a Java way of thinking, (which has apparently corrupted my conceptualization of C++), the results would be "bark bark bark bark". I understand from my previous question that this difference is due to slicing and I now have a good understanding of what slicing is.

But let's say that I want a function that returns an Animal value that is really a Dog.

  1. Do I understand correctly that the closest that I can get is a reference?
  2. Furthermore, is it incumbent upon the one using the rFunc interface to see that the returned reference is assigned to an Animal&? (Or otherwise intentionally assign the reference to an Animal which, via slicing, discards polymorphism.)
  3. How on earth am I supposed to return a reference to a newly generated object without doing the stupid thing I did above in rFunc? (At least I've heard this is stupid.)

Update: since everyone seems to agree so far that rFunc is illegitimate, that brings up another related question:

If I pass back a pointer how do I communicate to the programmer that the pointer is not theirs to delete if this is the case? Or alternatively how do I communicate that the pointer is subject to deletion at any time (from the same thread but a different function) so that the calling function should not store it, if this is the case. Is the only way to communicate this through comments? That seems sloppy.

Note: All this is leading up to an idea for a templated shared_pimpl concept I was working on. Hopefully I'll learn enough to post something about that in a couple of days.


Answer 1:


1) If you're creating new objects, you never want to return a reference (see your own comment on #3). You can return a pointer, possibly wrapped in std::shared_ptr or std::auto_ptr. (Note that std::auto_ptr was deprecated in C++11 and removed in C++17; std::unique_ptr is its replacement.) You could also return by copy, but this is incompatible with using the new operator; it's also slightly incompatible with polymorphism.

2) rFunc is just wrong. Don't do that. If you used new to create the object, then return it through an (optionally wrapped) pointer.

3) You're not supposed to. That is what pointers are for.
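As a rough illustration of the "optionally wrapped pointer" advice, here is a minimal sketch that assumes C++14 and uses std::unique_ptr rather than the wrappers named in the answer. The names makeAnimal, soundOfNewAnimal and sound() are hypothetical, as is the virtual destructor, which the original classes lack but which is needed to delete a Dog through an Animal pointer.

```cpp
#include <memory>
#include <string>

class Animal {
public:
    virtual ~Animal() = default;  // assumption: added so deletion through Animal* is well defined
    virtual std::string sound() const { return "rawr"; }
};

class Dog : public Animal {
public:
    std::string sound() const override { return "bark"; }
};

// Hypothetical factory: the return type states that the caller owns the object.
std::unique_ptr<Animal> makeAnimal() {
    return std::make_unique<Dog>();  // unique_ptr<Dog> converts to unique_ptr<Animal>
}

// Hypothetical caller: virtual dispatch happens through the pointer, so nothing is sliced.
std::string soundOfNewAnimal() {
    std::unique_ptr<Animal> a = makeAnimal();
    return a->sound();
}  // the Dog is destroyed here, with no explicit delete
```

With this shape the ownership question never needs a comment: whoever holds the unique_ptr is responsible for the object, and the compiler rejects accidental copies of it.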


EDIT (responding to your update): It's hard to picture the scenario you're describing. Would it be more accurate to say that the returned pointer may be invalid once the caller makes a call to some other (specific) method?

I'd advise against using such a model, but if you absolutely must do this, and you must enforce this in your API, then you probably need to add a level of indirection, or even two. Example: Wrap the real object in a reference-counted object which contains the real pointer. The reference-counted object's pointer is set to null when the real object is deleted. This is ugly. (There may be better ways to do it, but they may still be ugly.)




Answer 2:


To answer the second part of your question ("how do I communicate that the pointer is subject to deletion at any time") -

This is a dangerous practice, and has subtle details you will need to consider. It is racy in nature.

If the pointer can be deleted at any point in time, it is never safe to use it from another context, because even if you check "are you still valid?" every time, it may be deleted just a tiny bit after the check, but before you get to use it.

A safe way to do these things is the "weak pointer" concept: have the object be stored as a shared pointer (one level of indirection, can be released at any time), and have the returned value be a weak pointer, something that you must query before you can use, and must release after you've used it. This way, as long as the object is still valid, you can use it.

Pseudo code (based on invented weak and shared pointers, I'm not using Boost...) -

weak< Animal > animalWeak = getAnimalThatMayDisappear();
// ...
{
    shared< Animal > animal = animalWeak.getShared();
    if ( animal )
    {
        // 'animal' is still valid, use it.
        // ...
    }
    else
    {
        // 'animal' is not valid, can't use it. It points to NULL.
        // Now what?
    }
}
// And at this point the shared pointer of 'animal' is implicitly released.

But this is complex and error prone, and would likely make your life harder. I'd recommend going for simpler designs if possible.
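The invented weak and shared types in the pseudocode correspond closely to std::weak_ptr and std::shared_ptr from the standard library (C++11 and later). The following is a minimal sketch under that assumption; the helper soundOrGone and the sound() member are hypothetical names used only for illustration.

```cpp
#include <memory>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "rawr"; }
};

struct Dog : Animal {
    std::string sound() const override { return "bark"; }
};

// Hypothetical observer-side helper. lock() returns either a shared_ptr that
// keeps the object alive while we use it, or an empty shared_ptr if the
// object has already been destroyed.
std::string soundOrGone(const std::weak_ptr<Animal>& weak) {
    if (std::shared_ptr<Animal> animal = weak.lock()) {
        return animal->sound();  // safe: 'animal' holds a reference for this scope
    }
    return "gone";               // the owner released the object earlier
}
```

The check and the use cannot be separated by a deletion here, because the shared_ptr obtained from lock() keeps the object alive until it goes out of scope.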




Answer 3:


In order to avoid slicing you have to return or pass around a pointer to the object. (Note that a reference is basically a 'permanently dereferenced pointer'.)

Animal r2 = rFunc();
r2.makeSound();

Here, r2 is being instantiated (using the compiler-generated copy constructor), but it leaves off the Dog parts. If you do it like this, the slicing won't occur:

Animal& r2 = rFunc();

However your vFunc() function slices inside the method itself.
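The difference between binding a reference and copying into a base-class object can be seen in isolation, without any heap allocation. This is a minimal sketch; viaReference, viaCopy and sound() are hypothetical names, not part of the original code.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "rawr"; }
};

struct Dog : Animal {
    std::string sound() const override { return "bark"; }
};

// The parameter refers to the caller's object, so the dynamic type is preserved.
std::string viaReference(const Animal& a) { return a.sound(); }

// The parameter is a new Animal copy-constructed from the argument:
// only the Animal part is copied, so the Dog override is lost (slicing).
std::string viaCopy(Animal a) { return a.sound(); }
```

Passing the same Dog to both functions yields "bark" from viaReference and "rawr" from viaCopy, which mirrors the r1 and r2 lines in the question.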

I'll also mention this function:

Animal& rFunc()
{
    return *(new Dog());
}

It's weird and unsafe; you're returning a reference to an unnamed, heap-allocated Dog that nothing owns, so the caller has no obvious way to know it must be deleted. It's more appropriate to return the pointer. Returning references is normally used to return member variables and so on.




Answer 4:


If I pass back a pointer how do I communicate to the programmer that the pointer is not theirs to delete if this is the case? Or alternatively how do I communicate that the pointer is subject to deletion at any time (from the same thread but a different function) so that the calling function should not store it, if this is the case.

If you really can't trust the user, then don't give them a pointer at all: pass back an integer-type handle and expose a C-style interface (e.g., you have a vector of instances on your side of the fence, and you expose a function that takes the integer as the first parameter, indexes into the vector and calls a member function). That's the old-fashioned way (notwithstanding that we didn't always have fancy things like "member functions" ;) ).
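One possible shape for such a handle-based interface is sketched below. Everything here (AnimalHandle, AnimalRegistry, addDog, remove, and the empty-string convention for stale handles) is a hypothetical design for illustration, and it uses a small class rather than free C functions to keep the example short.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "rawr"; }
};

struct Dog : Animal {
    std::string sound() const override { return "bark"; }
};

using AnimalHandle = std::size_t;  // hypothetical opaque handle: just an index

class AnimalRegistry {             // hypothetical owner of every Animal
public:
    AnimalHandle addDog() {
        animals_.push_back(std::make_unique<Dog>());
        return animals_.size() - 1;
    }

    // Destroys the object; the slot stays in place so other handles remain valid.
    void remove(AnimalHandle h) {
        if (h < animals_.size()) animals_[h].reset();
    }

    // Returns an empty string for a stale or unknown handle instead of touching freed memory.
    std::string sound(AnimalHandle h) const {
        if (h >= animals_.size() || !animals_[h]) return "";
        return animals_[h]->sound();
    }

private:
    std::vector<std::unique_ptr<Animal>> animals_;
};
```

The caller never sees a pointer, so there is nothing to delete and nothing to dangle; the cost is a lookup and a validity check on every call.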

Otherwise, try using a smart pointer with appropriate semantics. Nobody sane would ever think that delete &*some_boost_shared_ptr; is a good idea.




Answer 5:


But let's say that I want a function that returns an Animal value that is really a Dog.

  1. Do I understand correctly that the closest that I can get is a reference?

Yes, you are correct. But I think the problem isn't so much that you don't understand references as that you don't understand the different kinds of variables in C++ or how new works in C++. In C++, a variable can be primitive data (int, float, double, etc.), an object, or a pointer or reference to a primitive or an object. In Java, a variable can only be a primitive or a reference to an object.

In C++, when you declare a variable, actual memory is allocated and associated with the variable. In Java, you have to explicitly create objects using new and explicitly assign the new object to a variable. The key point here, though, is that in C++ the object and the variable you use to access it are not the same thing when the variable is a pointer or reference. Animal a; means something different from Animal *a;, which means something different from Animal &a;. None of these have compatible types, and they are not interchangeable.

When you type Animal a1; in C++, a new Animal object is created. So, when you type Animal a2 = a1;, you end up with two variables (a1 and a2) and two Animal objects at different locations in memory. Both objects have the same value, but you can change their values independently if you want. In Java, if you typed the same exact code, you'd end up with two variables, but only one object. As long as you didn't reassign either of the variables, they would always have the same value.
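A small sketch makes the "two variables, two objects" point concrete. The name member and the two helper functions are hypothetical additions for illustration; the Animal in the question has no data members.

```cpp
#include <string>

struct Animal {
    std::string name;  // hypothetical member, added only to make the copy observable
};

// Copying creates a second, independent object.
std::string copyThenRename() {
    Animal a1{"Rex"};
    Animal a2 = a1;      // a2 is a distinct Animal with its own name
    a2.name = "Fido";
    return a1.name;      // a1 is unaffected by the change to a2
}

// A reference is another name for the same object, closer to what Java does.
std::string aliasThenRename() {
    Animal a1{"Rex"};
    Animal& alias = a1;  // no new object is created
    alias.name = "Fido";
    return a1.name;      // the change is visible through a1
}
```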

  2. Furthermore, is it incumbent upon the one using the rFunc interface to see that the returned reference is assigned to an Animal&? (Or otherwise intentionally assign the reference to an Animal which, via slicing, discards polymorphism.)

When you use references and pointers, you can access an object's value without copying it to where you want to use it. That allows you to change it from outside the curly braces where you declared the object. References are generally used as function parameters or to return an object's private data members without making a new copy of them. Typically, when you receive a reference, you don't assign it to anything. Using your example, instead of assigning the reference returned by rFunc() to a variable, one would normally type rFunc().makeSound();.

So, yes, it is incumbent on the user of rFunc(), if they assign the return value to anything, to assign it to a reference. You can see why. If you assign the reference returned by rFunc() to a variable declared as Animal animal_variable, you end up with one Animal variable, one Animal object, and one Dog object. The Animal object associated with animal_variable is, as much as possible, a copy of the Dog object that was returned by reference from rFunc(). But, you can't get polymorphic behavior from animal_variable because that variable isn't associated with a Dog object. The Dog object that was returned by reference still exists because you created it using new, but it is no longer accessible--it was leaked.

  3. How on earth am I supposed to return a reference to a newly generated object without doing the stupid thing I did above in rFunc? (At least I've heard this is stupid.)

The problem is that you can create an object in three ways.

{ // the following expressions evaluate to ...
 Animal local;
 // an object that will be destroyed when control exits this block
 Animal();
 // an unnamed temporary that is destroyed at the end of the statement unless it is bound to a const or rvalue reference
 new Animal();
 // a pointer to an unnamed Animal that can never be deleted unless the pointer is stored in an Animal pointer variable
 {
  // doing other stuff
 }
} // <- local destroyed

All new does in C++ is create an object in memory where it won't be destroyed until you say so. But, in order to destroy it, you have to remember where in memory it was created. You do that by creating a pointer variable, Animal *AnimalPointer;, and assigning the pointer returned by new Animal() to it, AnimalPointer = new Animal();. To destroy the Animal object when you are done with it, you have to type delete AnimalPointer;.
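The three lifetimes described above can be observed with an instance counter. This is only a sketch: the static alive counter and the three helper functions are hypothetical instrumentation, not something the original Animal has.

```cpp
struct Animal {
    static int alive;                   // hypothetical counter of live Animal objects
    Animal() { ++alive; }
    Animal(const Animal&) { ++alive; }
    ~Animal() { --alive; }
};
int Animal::alive = 0;

int aliveAfterLocalScope() {
    {
        Animal local;                   // automatic storage
    }                                   // 'local' is destroyed at this brace
    return Animal::alive;
}

int aliveAfterTemporary() {
    Animal();                           // unnamed temporary, destroyed at the end of this statement
    return Animal::alive;
}

int aliveWhileHeapObjectExists() {
    Animal* p = new Animal();           // lives until someone calls delete
    int during = Animal::alive;         // the heap object is counted here
    delete p;                           // without this line the object would leak
    return during;
}
```

The first two functions report no live objects, while the third sees exactly one object between new and delete.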




Answer 6:


Point 1: do not use references. Use pointers.

Point 2: what you have above is called a taxonomy, which is a hierarchical classification scheme. Taxonomies are the exemplar of a kind of structure that is utterly unsuitable for object-oriented modelling. Your trivial example only works because your base Animal assumes all animals make a noise, and can't do anything else interesting.

If you try to implement a relation, such as

virtual bool Animal::eats(Animal *other)=0;

you will find you cannot do it. The thing is: Dog is not a subtype of the Animal abstraction. The whole point of taxonomies is that the classes at each level of the partition have new and interesting properties.

For example: Vertebrates have a backbone and we can ask whether it is made of cartilage or bone. We can't even ask that question of Invertebrates.

To fully understand, you must see that you cannot make a Dog object. After all, it's an abstraction, right? Because there are Kelpies and Collies, and an individual Dog has to be of some species. The classification scheme can be as deep as you like, but it can never support any concrete individuals. Fido is not-a-Dog; that's just his classification tag.




Answer 7:


(I'm ignoring your problems with dynamic memory going into references causing memory leaks... )

Your slicing problems go away when Animal is an abstract base class. That means it has at least one pure virtual method and cannot be directly instantiated. The following becomes a compiler error:

Animal a = rFunc();   // a cannot be directly instantiated
                      // slicing prevented by compiler!

but the compiler allows:

Animal* a = pFunc();  // polymorphism maintained!
Animal& a = rFunc();  // polymorphism maintained!

Thus the compiler saves the day!
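A minimal sketch of the abstract-base-class approach follows, assuming C++11. The sound() member and soundOf helper are hypothetical stand-ins for makeSound(), chosen so the result can be returned as a value.

```cpp
#include <string>
#include <type_traits>

class Animal {
public:
    virtual ~Animal() = default;
    virtual std::string sound() const = 0;  // pure virtual: Animal is now abstract
};

class Dog : public Animal {
public:
    std::string sound() const override { return "bark"; }
};

// Compile-time confirmation that Animal cannot be instantiated but Dog can.
static_assert(std::is_abstract<Animal>::value, "Animal should be abstract");
static_assert(!std::is_abstract<Dog>::value, "Dog should be concrete");

// Animal a = someDog;  // would not compile: cannot declare a variable of abstract type

// References (and pointers) to the abstract base are still allowed.
std::string soundOf(const Animal& a) { return a.sound(); }
```

Because no Animal variable can be declared at all, the accidental copy that caused "rawr" in the question is rejected before the program ever runs.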




Answer 8:


If you want to return a polymorphic type from a method and don't want to allocate it on the heap, you could consider making it a field of that method's class and having the function return a pointer to it, typed as whatever base class you want.
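One way to read this suggestion is sketched below. The Kennel class and its resident() accessor are hypothetical names; the key assumption is that the returned pointer is non-owning and must not outlive the object that holds the field.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "rawr"; }
};

struct Dog : Animal {
    std::string sound() const override { return "bark"; }
};

class Kennel {  // hypothetical class that owns the Dog as an ordinary member
public:
    // Non-owning pointer to the base class: the caller must not delete it,
    // and it is valid only while this Kennel object exists.
    Animal* resident() { return &dog_; }

private:
    Dog dog_;   // stored by value, so no heap allocation is involved
};
```

The Dog's lifetime is tied to the Kennel, so there is no new, no delete and no slicing, at the cost of the pointer dangling if the Kennel is destroyed first.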



Source: https://stackoverflow.com/questions/4405634/learning-c-returning-references-and-getting-around-slicing
