Why are std::vector::data and std::string::data different?

左心房为你撑大大i 提交于 2019-11-27 14:17:15

In C++98/03 there was good reason to not have a non-const data() due to the fact that string was often implemented as COW. A non-const data() would have required a copy to be made if the refcount was greater than 1. While possible, this was not seen as desirable in C++98/03.

In Oct. 2005 the committee voted in LWG 464 which added the const and non-const data() to vector, and added const and non-const at() to map. At that time, string had not been changed so as to outlaw COW. But later, by C++11, a COW string is no longer conforming. The string spec was also tightened up in C++11 such that it is required to be contiguous, and there's always a terminating null exposed by operator[](size()). In C++03, the terminating null was only guaranteed by the const overload of operator[].

So in short a non-const data() looks a lot more reasonable for a C++11 string. To the best of my knowledge, it was never proposed.

Update

charT* data() noexcept;

was added basic_string in the C++1z working draft N4582 by David Sankel's P0272R1 at the Jacksonville meeting in Feb. 2016.

Nice job David!

Historically, the string data has not been const because it would prevent several common optimizations, like copy-on-write (COW). This is now, IIANM, far less common, because it behaves badly with multithreaded programs.

BTW, yes they are now required to be contiguous:

[string.require].5: The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().

Another reason might be to avoid code such as:

std::string ret;
strcpy(ret.data(), "whatthe...");

Or any other function that returns a preallocated char array.

Although I'm not that well-versed in the standard, it might be due to the fact that std::string doesn't need to contain null-terminated data, but it can and it doesn't need to contain an explicit length field, but it can. So changing the undelying data and e.g. adding a '\0' in the middle might get the strings length field out of sync with the actual char data and thus leave the object in an invalid state.

curiousguy

@Christian Rau

From the time the original Plauger (around 1995 I think) string class was STL-ized by the committee (turned into a Sequence, templatified), std::string has always been std::vector plus string-related stuff (conversion from/to 0-terminated, concatenation, ...), plus some oddities, like COW that's actually "Copy on Write and on non-const begin()/end()/operator[]".

But ultimately a std::string is really a std::vector under another name, with a slightly different focus and intent. So:

  • just like std::vector, std::string has either a size data member or both start and end data members;
  • just like std::vector, std::string does not care about the value of its elements, embedded NUL or others.

std::string is not a C string with syntax sugar, utility functions and some encapsulation, just like std::vector<T> is not T[] with syntax sugar, utility functions and some encapsulation.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!