About unique_ptr performances

春和景丽 2021-01-21 01:17

I often read that unique_ptr should be preferred over shared_ptr in most situations, because unique_ptr is non-copyable and has move semantics, while shared_ptr adds overhead

3 Answers
  •  死守一世寂寞
    2021-01-21 01:58

    UPDATED on Jan 01, 2014

    I know this question is pretty old, but the results are still valid on G++ 4.7.0 and libstdc++ 4.7. So, I tried to find out the reason.

    What you're benchmarking here is the dereferencing performance using -O0 and, looking at the implementation of unique_ptr and shared_ptr, your results are actually correct.

    unique_ptr stores the pointer and the deleter in a ::std::tuple, while shared_ptr stores a raw pointer directly. So when you dereference a unique_ptr (using *, ->, or get), there is an extra call to ::std::get<0>(), whereas shared_ptr returns the pointer directly. With gcc-4.7, even when optimized and inlined, ::std::get<0>() is a bit slower than the direct pointer access. With gcc-4.8.1, optimization and inlining remove the overhead of ::std::get<0>() entirely. On my machine, when compiled with -O3, the compiler generates exactly the same assembly code for both loops, which means they are literally the same.
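
    The layout difference described above can be sketched with two toy owners. This is only an illustration, not libstdc++'s actual code: TupleOwner, DirectOwner, SketchDeleter, Counter and bump_through are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical deleter, standing in for std::default_delete.
struct SketchDeleter {
  template <typename T>
  void operator()(T* p) const { delete p; }
};

// Hypothetical owner that keeps pointer and deleter together in a tuple,
// in the spirit of the unique_ptr layout described in the answer.
template <typename T>
class TupleOwner {
 public:
  explicit TupleOwner(T* p) : state_(p, SketchDeleter{}) {}
  ~TupleOwner() { std::get<1>(state_)(std::get<0>(state_)); }
  TupleOwner(const TupleOwner&) = delete;
  TupleOwner& operator=(const TupleOwner&) = delete;

  // Every access goes through std::get<0>().
  T* get() const { return std::get<0>(state_); }
  T* operator->() const { return get(); }
  T& operator*() const { return *get(); }

 private:
  std::tuple<T*, SketchDeleter> state_;
};

// Hypothetical owner that keeps the raw pointer as a plain member.
template <typename T>
class DirectOwner {
 public:
  explicit DirectOwner(T* p) : ptr_(p) {}
  ~DirectOwner() { delete ptr_; }
  DirectOwner(const DirectOwner&) = delete;
  DirectOwner& operator=(const DirectOwner&) = delete;

  // Access returns the member directly.
  T* get() const { return ptr_; }
  T* operator->() const { return get(); }
  T& operator*() const { return *get(); }

 private:
  T* ptr_;
};

struct Counter {
  int i = 0;
  void bump() { ++i; }
};

// Calls bump() n times through the owner and returns the resulting count.
template <typename Owner>
int bump_through(Owner& owner, int n) {
  for (int k = 0; k < n; ++k) owner->bump();
  return owner->i;
}
```

    Both owners give the same result. Whether the std::get<0>() hop costs anything depends on how well the compiler inlines it, which is the point made above about -O0 versus -O3.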

    All in all, with the current implementation, shared_ptr is slower on creation, moving, copying and reference counting, but just as fast *on dereferencing*.
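
    A small sketch of where the extra work sits, using only standard library calls. Payload and the helper function names are hypothetical and exist only for this example. Copying a shared_ptr has to update a shared reference count, moving a unique_ptr only hands over the raw pointer, and reading through either one is a plain pointer access.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Payload {
  int value = 42;
};

// Copying a shared_ptr increments the shared reference count.
// Returns the count observed while the copy is still alive.
long count_after_copy(const std::shared_ptr<Payload>& p) {
  std::shared_ptr<Payload> copy = p;
  return copy.use_count();
}

// Moving a unique_ptr transfers the raw pointer and leaves the source empty.
// Returns true if the source is null and the destination owns the object.
bool source_empty_after_move(std::unique_ptr<Payload>& p) {
  std::unique_ptr<Payload> moved = std::move(p);
  return p == nullptr && moved != nullptr;
}

// Dereferencing reads through the stored pointer in both cases.
int read_shared(const std::shared_ptr<Payload>& p) { return p->value; }
int read_unique(const std::unique_ptr<Payload>& p) { return p->value; }
```

    The count drops back once the copy goes out of scope, so every copy costs an increment and a decrement that unique_ptr never pays.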

    NOTE: print() is empty in the question and the compiler omits the loops when optimized. So, I slightly changed the code to correctly observe the optimization results:

    #include <chrono>
    #include <iostream>
    #include <memory>
    #include <vector>
    
    using namespace std;
    
    class Print {
     public:
      void print() { i++; }
    
      int i{ 0 };
    };
    
    void test() {
      typedef vector<shared_ptr<Print>> sh_vec;
      typedef vector<unique_ptr<Print>> u_vec;
    
      sh_vec shvec;
      u_vec uvec;
    
      // can't use initializer_list with unique_ptr
      for (int var = 0; var < 100; ++var) {
        shvec.push_back(make_shared<Print>());
        uvec.emplace_back(new Print());
      }
    
      //-------------test shared_ptr-------------------------
      auto time_sh_1 = std::chrono::system_clock::now();
    
      for (auto var = 0; var < 1000; ++var) {
        for (auto it = shvec.begin(), end = shvec.end(); it != end; ++it) {
          (*it)->print();
        }
      }
    
      auto time_sh_2 = std::chrono::system_clock::now();
    
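      // system_clock ticks are not guaranteed to be microseconds, so convert explicitly
      cout << "test shared_ptr : "
           << std::chrono::duration_cast<std::chrono::microseconds>(time_sh_2 - time_sh_1).count()
           << " microseconds." << endl;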
    
      //-------------test unique_ptr-------------------------
      auto time_u_1 = std::chrono::system_clock::now();
    
      for (auto var = 0; var < 1000; ++var) {
        for (auto it = uvec.begin(), end = uvec.end(); it != end; ++it) {
          (*it)->print();
        }
      }
    
      auto time_u_2 = std::chrono::system_clock::now();
    
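      // system_clock ticks are not guaranteed to be microseconds, so convert explicitly
      cout << "test unique_ptr : "
           << std::chrono::duration_cast<std::chrono::microseconds>(time_u_2 - time_u_1).count()
           << " microseconds." << endl;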
    }
    
    int main() { test(); }
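
    One possible variation on the benchmark above, as a hedged sketch rather than a definitive harness: return the accumulated counter so the loop's result is actually observed, and time with steady_clock, which is monotonic. Counter, run_rounds and time_rounds_us are hypothetical names for this example. Consuming the result keeps the work from being discarded as dead code, although an optimizer may still simplify the loops.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <vector>

struct Counter {
  int i = 0;
  void bump() { ++i; }
};

// Runs `rounds` passes over a vector of smart pointers, bumping each element,
// then returns the sum of all counters so the work has a visible result.
template <typename Vec>
long long run_rounds(Vec& v, int rounds) {
  for (int r = 0; r < rounds; ++r) {
    for (auto& p : v) p->bump();
  }
  long long total = 0;
  for (auto& p : v) total += p->i;
  return total;
}

// Times run_rounds with steady_clock and returns elapsed microseconds.
// The computed sum is written to `total` so the caller can check or print it.
template <typename Vec>
long long time_rounds_us(Vec& v, int rounds, long long& total) {
  auto start = std::chrono::steady_clock::now();
  total = run_rounds(v, rounds);
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
      .count();
}
```

    The same template works for both vector<shared_ptr<Counter>> and vector<unique_ptr<Counter>>, so the two measurements run exactly the same loop.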
    

    NOTE: This is not a fundamental problem and could easily be fixed by dropping the use of ::std::tuple in the current libstdc++ implementation.
