What's the overhead from shared_ptr being thread-safe?

我的梦境 提交于 2019-12-03 14:01:25

If we check out cppreference page for std::shared_ptr they state the following in the Implementation notes section:

To satisfy thread safety requirements, the reference counters are typically incremented and decremented using std::atomic::fetch_add with std::memory_order_relaxed.

It is interesting to note an actual implementation, for example the libstdc++ implementation document here says:

For the version of shared_ptr in libstdc++ the compiler and library are fixed, which makes things much simpler: we have an atomic CAS or we don't, see Lock Policy below for details.

The Selecting Lock Policy section says (emphasis mine):

There is a single _Sp_counted_base class, which is a template parameterized on the enum __gnu_cxx::_Lock_policy. The entire family of classes is parameterized on the lock policy, right up to __shared_ptr, __weak_ptr and __enable_shared_from_this. The actual std::shared_ptr class inherits from __shared_ptr with the lock policy parameter selected automatically based on the thread model and platform that libstdc++ is configured for, so that the best available template specialization will be used. This design is necessary because it would not be conforming for shared_ptr to have an extra template parameter, even if it had a default value. The available policies are:

[...]

3._S_Single

This policy uses a non-reentrant add_ref_lock() with no locking. It is used when libstdc++ is built without --enable-threads.

and further says (emphasis mine):

For all three policies, reference count increments and decrements are done via the functions in ext/atomicity.h, which detect if the program is multi-threaded. If only one thread of execution exists in the program then less expensive non-atomic operations are used.

So at least in this implementation you don't pay for what you don't use.

At least in the boost code on i386, boost::shared_ptr was implemented using an atomic CAS operation. This meant that while it has some overhead, it is quite low. I'd expect any implementation of std::shared_ptr to be similar.

In tight loops in high performance numerical code I found some speed-ups by switching to raw pointers and being really careful. But for normal code - I wouldn't worry about it.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!