Why unique_ptr and shared_ptr do not invalidate the pointer they are constructed from?

问题

A note: this is an API design question, riding on the design of the constructors of unique_ptr and share_ptr for the sake of the question, but not aiming to propose any change to their current specifications.

Though it would usually be advisable to use make_unique and make_shared, both unique_ptr and shared_ptr can be constructed from a raw pointer.

Both get the pointer by value and copy it. Both allow (i.e. in the sense of: do not prevent) a continuance usage of the original pointer passed to them in the constructor.

The following code compiles and results with double free:

int* ptr = new int(9);
std::unique_ptr<int> p { ptr };
// we forgot that ptr is already being managed
delete ptr;

Both unique_ptr and shared_ptr could prevent the above if their relevant constructors would expect to get the raw pointer as an rvalue, e.g. for unique_ptr:

template<typename T>
class unique_ptr {
  T* ptr;
public:
  unique_ptr(T*&& p) : ptr{p} {
    p = nullptr; // invalidate the original pointer passed
  }
  // ...

Thus, the original code would not compile as an lvalue cannot bind to an rvalue, but using std::move the code compiles, while being more verbose and more secured:

int* ptr = new int(9);
std::unique_ptr<int> p { std::move(ptr) };
if (!ptr) {
  // we are here, since ptr was invalidated
}

It is clear that there can be dozens of other bugs a user can do with smart pointers. The commonly used argument of you should know how to properly use the tools provided by the language, and C++ is not designed to watch over you etc.

But still, it seems that there could have been an option for preventing this simple bug and to encourage usage of make_shared and make_unique. And even before make_unique was added in C++14, there is still always the option of direct allocation without a pointer variable, as:

auto ptr = std::unique_ptr<int>(new int(7));

It seems that requesting rvalue reference to a pointer as the constructor parameter could add a bit of an extra safety. Moreover, the semantics of getting rvalue seems to be more accurate as we take ownership of the pointer that is passed.

Coming to the question of why didn't the standard take this more secured approach?

A possible reason might be that the approach suggested above would prevent creating a unique_ptr from const pointers, i.e. the following code would fail to compile with the proposed approach:

int* const ptr = new int(9);
auto p = std::unique_ptr { std::move(ptr) }; // cannot bind `const rvalue` to `rvalue`

But this seems to be a rare scenario worth neglecting, I believe.

Alternatively, in case the need to support initialization from a const pointer is a strong argument against the proposed approach, then a smaller step could still be achieved with:

unique_ptr(T* const&& p) : ptr{p} {
    // ...without invalidating p, but still better semantics?
}

回答1:

As long as they both have a get() method, invalidating the original pointer is not a "more secured", but a more confusing approach.

In the way raw pointers are used in C++, they don't represent any ownership concept by themselves and don't define the object lifetime. Anyone who uses raw pointers in C++ needs to be aware that they are only valid while the object exists (whether the object lifetime is enforced by smart pointers or by the logic of the program). Creating a smart pointer from a raw pointer does not "transfer" ownership of the object, but "assigns" it.

This distinction could be made clear in the following example:

std::unique_ptr<int> ptr1 = std::make_unique<int>(1);
int* ptr2 = ptr1.get();
// ...
// somewhere later:
std::unique_ptr<int> ptr3(ptr2);
// or std::unique_ptr<int> ptr3(std::move(ptr2));

In this case, the ownership of the int object is not transferred to the ptr3. It is erroneously assigned to it without releasing the ownership by ptr1. A programmer needs to be aware where the pointer passed to the std::unique_ptr constructor comes from, and how the ownership of the object has been enforced so far. An assurance by the library that it will invalidate this particular pointer variable may give the programmer a false sense of security, but does not provide any real safety.

回答2:

I think the answer is simple: zero overhead. This change isn't needed for unique_ptr to be functional, so the standard doesn't require it. If you think this improves safety enough to worth it, you can ask your implementation to add it (maybe under a special compilation flag).

BTW, I'd expect static analyzers to know enough to warn against this code pattern.

来源：https://stackoverflow.com/questions/60535023/why-unique-ptr-and-shared-ptr-do-not-invalidate-the-pointer-they-are-constructed

标签

c++

shared-ptr

api-design

unique-ptr

rvalue-reference