Why are two raw pointers to the managed object needed in std::shared_ptr implementation?

Question

Here's a quote from the implementation notes section of cppreference's std::shared_ptr page. It mentions two different pointers to the managed object: the one returned by get(), and the one held in the control block that refers to the actual data.

In a typical implementation, std::shared_ptr holds only two pointers:

the stored pointer (one returned by get())

a pointer to control block

The control block is a dynamically-allocated object that holds:

either a pointer to the managed object or the managed object itself

the deleter (type-erased)

the allocator (type-erased)

the number of shared_ptrs that own the managed object

the number of weak_ptrs that refer to the managed object

The pointer held by the shared_ptr directly is the one returned by get(), while the pointer or object held by the control block is the one that will be deleted when the number of shared owners reaches zero. These pointers are not necessarily equal.

My question is: why are two different pointers to the managed object needed (the stored pointer and the one in the control block), in addition to the pointer to the control block? Doesn't the one returned by get() suffice? And why aren't these pointers necessarily equal?

Answer 1:

The reason is that a shared_ptr can point to something other than what it owns, and that is by design. This is implemented by the aliasing constructor, listed as number 8 on cppreference:

template< class Y >
shared_ptr( const shared_ptr<Y>& r, T *ptr );

A shared_ptr created with this constructor shares ownership with r, but points to ptr. Consider this (contrived but illustrative) code:

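#include <memory>   // std::shared_ptr
#include <utility>  // std::pair

std::shared_ptr<int> creator()
{
  using Pair = std::pair<int, double>;

  // p owns the whole Pair.
  std::shared_ptr<Pair> p(new Pair(42, 3.14));
  // q shares ownership with p but points at the int member only.
  std::shared_ptr<int> q(p, &(p->first));
  return q;
}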

Once this function exits, only a pointer to the int subobject of the pair is available to client code. But because of the shared ownership between q and p, the pointer q "keeps alive" the entire Pair object.

Once deallocation is supposed to happen, the pointer to the entire Pair object must be passed to the deleter. Hence the pointer to the Pair object must be stored somewhere alongside the deleter, in other words, in the control block.
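The claim that the two pointers can differ, and that the deleter still receives the pointer to the whole object, can be checked with a small sketch. Everything named here (pair_deletions, alias_second) is hypothetical and exists only for this illustration. It aliases the second member rather than first, because first typically sits at the same address as the Pair itself, which would hide the difference.

```cpp
#include <cassert>
#include <memory>
#include <utility>

using Pair = std::pair<int, double>;

// Hypothetical counter: how many times the custom deleter below has run.
static int pair_deletions = 0;

// Returns a shared_ptr<double> that points at the double member
// but shares ownership of the whole Pair.
std::shared_ptr<double> alias_second()
{
  // The deleter is handed the Pair* kept in the control block.
  std::shared_ptr<Pair> p(new Pair(42, 3.14), [](Pair* whole) {
    ++pair_deletions;
    delete whole;
  });

  // Aliasing constructor: share p's control block, store a different pointer.
  std::shared_ptr<double> q(p, &p->second);

  assert(q.use_count() == 2);  // p and q use one control block
  // The stored pointer of q is not the address of the owned Pair.
  assert(static_cast<void*>(q.get()) != static_cast<void*>(p.get()));

  return q;  // p goes away here; q alone keeps the Pair alive
}
```

While the returned pointer is alive, the deleter has not run; once it is destroyed, the deleter runs exactly once and receives the Pair*, not the double*.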

For a less contrived example (probably even one closer to the original motivation for the feature), consider the case of pointing to a base class. Something like this:

struct Base1
{
  // :::
};

struct Base2
{
  // :::
};

struct Derived : Base1, Base2
{
 // :::
};

std::shared_ptr<Base2> creator()
{
  std::shared_ptr<Derived> p(new Derived());
  std::shared_ptr<Base2> q(p, static_cast<Base2*>(p.get()));
  return q;
}

Of course, the real implementation of std::shared_ptr has all the implicit conversions in place so that the p-and-q dance in creator is not necessary, but I've kept it there to resemble the first example.
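As a sketch of the same idea using only the implicit conversion, the following assumes the base classes have data members (the members a and b, the counter derived_destroyed and the function make_base2 are hypothetical additions for this illustration). With a non-empty Base1 laid out first, which is what the common ABIs do, the Base2 subobject starts at a different address than the Derived object, so get() differs from the pointer the control block will delete.

```cpp
#include <cassert>
#include <memory>

struct Base1 { int a = 1; };
struct Base2 { int b = 2; };

// Hypothetical counter: how many Derived objects have been destroyed.
static int derived_destroyed = 0;

struct Derived : Base1, Base2
{
  ~Derived() { ++derived_destroyed; }
};

// Uses the implicit shared_ptr<Derived> -> shared_ptr<Base2> conversion.
std::shared_ptr<Base2> make_base2()
{
  std::shared_ptr<Derived> p(new Derived());
  std::shared_ptr<Base2> q = p;  // stored pointer becomes a Base2*

  // The stored pointer is the adjusted base-class pointer.
  assert(q.get() == static_cast<Base2*>(p.get()));
  // Layout assumption (holds on the common ABIs): Base2 is not at offset 0.
  assert(static_cast<void*>(q.get()) != static_cast<void*>(p.get()));

  return q;
}
```

Even though Base2 has no virtual destructor, destroying the last owner runs ~Derived, because the control block remembers the original Derived*.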

Answer 2:

One inescapable need for a control block is to support weak pointers. It is not always feasible to notify all weak pointers on destruction of an object (in fact, it is almost always infeasible). Accordingly, the weak pointers need something to point at until they have all gone away. Thus, some block of memory has to hang around. That block of memory is the control block. Sometimes the object and the control block may be allocated together, but allocating them separately allows you to reclaim a potentially expensive object while keeping around the cheap control block.

The general rule is that the control block persists as long as there exists a single shared pointer or weak pointer referring to it, while the object is allowed to be reclaimed the instant there are no shared pointers pointing at it.
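A minimal sketch of this rule, assuming a hypothetical Payload type and a hypothetical counter payload_destroyed: the object is destroyed as soon as its last shared_ptr goes away, while the weak_ptr that outlives it can still safely ask the control block whether the object exists.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter: how many Payload objects have been destroyed.
static int payload_destroyed = 0;

struct Payload
{
  ~Payload() { ++payload_destroyed; }
};

// Returns a weak_ptr that outlives the only shared_ptr to its object.
std::weak_ptr<Payload> make_dangling_observer()
{
  std::shared_ptr<Payload> owner(new Payload());
  std::weak_ptr<Payload> observer = owner;

  assert(!observer.expired());        // object still alive here
  assert(observer.use_count() == 1);  // one shared owner

  return observer;  // owner is destroyed on return, and the Payload with it
}
```

After the call, the Payload is gone but the returned weak_ptr is still valid to query: expired() reports true and lock() yields an empty shared_ptr.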

This also allows for cases where the object is brought into shared ownership after its allocation. make_shared may be able to bundle these two concepts into one block of memory, but shared_ptr<T>(new T) must first allocate T, and then figure out how to share it after the fact. When this is undesirable, Boost has a related concept, intrusive_ptr, which does its reference counting directly inside the object rather than with a control block (you have to supply the increment and decrement functions, intrusive_ptr_add_ref and intrusive_ptr_release, yourself to make this work).
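To illustrate the intrusive approach without depending on Boost, here is a hand-rolled, single-threaded sketch. It is not boost::intrusive_ptr; the names Widget, WidgetPtr, add_ref and release are hypothetical, and the count is deliberately non-atomic.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counter: how many Widget objects have been destroyed.
static int widgets_destroyed = 0;

// The object carries its own reference count; there is no control block.
struct Widget
{
  int refs = 0;
  ~Widget() { ++widgets_destroyed; }
};

inline void add_ref(Widget* w) { ++w->refs; }
inline void release(Widget* w) { if (--w->refs == 0) delete w; }

// Minimal intrusive smart pointer: exactly one raw pointer wide.
class WidgetPtr
{
public:
  explicit WidgetPtr(Widget* w = nullptr) : ptr_(w) { if (ptr_) add_ref(ptr_); }
  WidgetPtr(const WidgetPtr& other) : ptr_(other.ptr_) { if (ptr_) add_ref(ptr_); }
  // Copy-and-swap: the by-value parameter releases the old object on exit.
  WidgetPtr& operator=(WidgetPtr other) { std::swap(ptr_, other.ptr_); return *this; }
  ~WidgetPtr() { if (ptr_) release(ptr_); }

  Widget* get() const { return ptr_; }

private:
  Widget* ptr_;
};
```

The trade-off in this sketch is that the count dies with the object, so there is nothing left for a weak pointer to observe, and the pointer cannot refer to anything other than the counted object itself.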

I have seen shared pointer implementations which do not have a control block. Instead, the shared pointers develop a linked-list between themselves. As long as the linked-list contains 1 or more shared_ptrs, the object is still alive. However, this approach is more complicated in a multithreading scenario because you have to maintain the linked list rather than just a simple ref count. Its runtime is also likely to be worse in many scenarios where you are assigning and re-assigning shared_ptrs repeatedly, because the linked-list is more heavy-weight.

It is also possible for a high-performance implementation to pool allocate the control blocks, driving the cost of using them to nearly zero.

Answer 3:

Let's look at a std::shared_ptr<int>. This is a reference-counted smart pointer that wraps an int*. The int* itself holds no reference-counting information, and the shared_ptr object cannot hold it either, since that object may well be destroyed long before the reference count drops to zero.

This means that we must have an intermediate object to hold the control information that is guaranteed to remain persistent until the reference count drops to zero.

Having said that, if you create the shared_ptr with make_shared, the int and the control block are typically created in one contiguous allocation, which saves an allocation and makes dereferencing more cache-friendly.
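One way to observe the combined allocation is a sketch with a counting allocator and std::allocate_shared, the allocator-aware sibling of make_shared. CountingAllocator, allocation_count and the two helper functions are hypothetical names for this illustration. The standard only recommends a single allocation, so the counts noted below reflect what mainstream standard libraries do rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical counter: number of calls to CountingAllocator::allocate.
static int allocation_count = 0;

// Minimal allocator that counts allocations and forwards to std::allocator.
template <class T>
struct CountingAllocator
{
  using value_type = T;

  CountingAllocator() = default;
  template <class U>
  CountingAllocator(const CountingAllocator<U>&) {}

  T* allocate(std::size_t n)
  {
    ++allocation_count;
    return std::allocator<T>().allocate(n);
  }
  void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Object and control block together: allocator calls made by allocate_shared.
int allocations_for_allocate_shared()
{
  allocation_count = 0;
  std::shared_ptr<int> p = std::allocate_shared<int>(CountingAllocator<int>(), 42);
  assert(*p == 42);
  return allocation_count;
}

// Object from plain new: the allocator is used for the control block only.
int allocations_for_separate_object()
{
  allocation_count = 0;
  std::shared_ptr<int> p(new int(42), std::default_delete<int>(), CountingAllocator<int>());
  assert(*p == 42);
  return allocation_count;
}
```

On mainstream implementations both functions report one allocator call, but in the second case the int was already allocated separately by new, so that path costs two allocations in total against one for allocate_shared.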

Source: https://stackoverflow.com/questions/34046070/why-are-two-raw-pointers-to-the-managed-object-needed-in-stdshared-ptr-impleme

Tags

c++

pointers

c++11

shared-ptr