Why does libc++'s implementation of std::string take up 3x as much memory as libstdc++'s?

轻奢々 2020-11-30 07:09

Consider the following test program:

#include <iostream>
#include <string>

int main()
{
    std::cout << sizeof(std::string) << '\n';
}

4 answers
  •  予麋鹿 (original poster)
    2020-11-30 07:19

    Summary: It only looks like libstdc++ uses one char*. In fact, it allocates more memory.

    So, you should not be concerned that Clang's libc++ implementation is memory inefficient.

    From the documentation of libstdc++ (under Detailed Description):

    A string looks like this:
    
                                            [_Rep]
                                            _M_length
       [basic_string<char_type>]            _M_capacity
       _M_dataplus                          _M_refcount
       _M_p ---------------->               unnamed array of char_type
    

    Where the _M_p points to the first character in the string, and you cast it to a pointer-to-_Rep and subtract 1 to get a pointer to the header.

    This approach has the enormous advantage that a string object requires only one allocation. All the ugliness is confined within a single pair of inline functions, which each compile to a single add instruction: _Rep::_M_data(), and string::_M_rep(); and the allocation function which gets a block of raw bytes and with room enough and constructs a _Rep object at the front.

    The reason you want _M_data pointing to the character array and not the _Rep is so that the debugger can see the string contents. (Probably we should add a non-inline member to get the _Rep for the debugger to use, so users can check the actual string length.)

    So, it just looks like one char* but that is misleading in terms of memory usage.
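The header-before-data trick described in the quoted documentation can be sketched as follows. This is only an illustration, not libstdc++'s actual code: `Header`, `make_string`, `header_of`, and `destroy_string` are hypothetical names, and reference counting and allocator support are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for libstdc++'s _Rep header; not the real type.
struct Header {
    std::size_t length;
    std::size_t capacity;
    int refcount;
};

// Allocate header and characters in ONE block and return a pointer to the
// first character. That pointer is all the string object itself would hold.
char* make_string(const char* text) {
    assert(text != nullptr);
    std::size_t len = std::strlen(text);
    void* raw = std::malloc(sizeof(Header) + len + 1);
    if (!raw) throw std::bad_alloc();
    Header* h = new (raw) Header{len, len, 0};  // header at the front
    char* data = reinterpret_cast<char*>(h + 1);  // characters right after it
    std::memcpy(data, text, len + 1);             // copy including '\0'
    return data;
}

// Recover the header: cast to pointer-to-Header and subtract 1.
Header* header_of(char* data) {
    return reinterpret_cast<Header*>(data) - 1;
}

// Release the single block that holds both header and characters.
void destroy_string(char* data) {
    std::free(header_of(data));
}
```

With this layout the handle is a single `char*`, yet every string still pays for `sizeof(Header)` plus the characters on the heap. For example, `header_of(make_string("hello"))->length` reads back 5.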

    Previously libstdc++ basically used this layout:

      struct _Rep_base
      {
        size_type               _M_length;
        size_type               _M_capacity;
        _Atomic_word            _M_refcount;
      };
    

    That is closer to the results from libc++.

    libc++ uses "short string optimization". The exact layout depends on whether _LIBCPP_ABI_ALTERNATE_STRING_LAYOUT is defined. If it is defined, the data pointer will be word-aligned if the string is short. For details, see the source code.

    Short string optimization avoids heap allocations for short strings by storing their characters inside the string object itself, so libc++'s std::string looks more costly than the libstdc++ implementation if you only consider the size of the object. sizeof(std::string) shows only the size of the object itself (the stack usage for a local variable), not the overall memory usage (object + heap).
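One way to look past `sizeof` is to check whether a string's characters live inside the object or in a separate block. The sketch below assumes nothing about a particular library's layout; `is_stored_inline` and `approx_footprint` are hypothetical helper names, and the estimate ignores allocator overhead and any hidden header in front of the characters.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Returns true if the character buffer lies inside the std::string object
// itself, i.e. no separate block holds the characters.
bool is_stored_inline(const std::string& s) {
    const char* begin = reinterpret_cast<const char*>(&s);
    const char* end = begin + sizeof(std::string);
    const char* data = s.data();
    std::less<const char*> before;  // well-defined order for unrelated pointers
    return !before(data, begin) && before(data, end);
}

// Rough estimate of the footprint: the object itself plus, when the
// characters live elsewhere, capacity() + 1 bytes for them.
std::size_t approx_footprint(const std::string& s) {
    assert(s.capacity() >= s.size());  // guaranteed by the standard
    std::size_t total = sizeof(std::string);
    if (!is_stored_inline(s)) total += s.capacity() + 1;
    return total;
}
```

A 1000-character string can never be stored inline, because the object is far smaller than that, so its footprint is at least `sizeof(std::string) + 1001`. Whether a short string such as `"hi"` is stored inline depends on the standard library and ABI in use, which is exactly the difference that `sizeof` alone hides.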
