Directly write into char* buffer of std::string

那年仲夏 提交于 2019-12-04 11:42:48

问题


So I have an std::string and have a function which takes char* and writes into it. Since std::string::c_str() and std::string::data() return const char*, I can't use them. So I was allocating a temporary buffer, calling a function with it and copying it into std::string.

Now I plan to work with big amount of information and copying this buffer will have a noticeable impact and I want to avoid it.

Some people suggested to use &str.front() or &str[0] but does it invoke the undefined behavior?


回答1:


C++98/03

Impossible. String can be copy on write so it needs to handle all reads and writes.

C++11/14

In [string.require]:

The char-like objects in a basic_string object shall be stored contiguously. That is, for any basic_string object s, the identity &*(s.begin() + n) == &*s.begin() + n shall hold for all values of n such that 0 <= n < s.size().

So &str.front() and &str[0] should work.

C++17

str.data(), &str.front() and &str[0] work.

Here it says:

charT* data() noexcept;

Returns: A pointer p such that p + i == &operator[](i) for each i in [0, size()].

Complexity: Constant time.

Requires: The program shall not alter the value stored at p + size().

The non-const .data() just works.

The recent draft has the following wording for .front():

const charT& front() const;

charT& front();

Requires: !empty().

Effects: Equivalent to operator[](0).

And the following for operator[]:

const_reference operator[](size_type pos) const;

reference operator[](size_type pos);

Requires: pos <= size().

Returns: *(begin() + pos) if pos < size(). Otherwise, returns a reference to an object of type charT with value charT(), where modifying the object leads to undefined behavior.

Throws: Nothing.

Complexity: Constant time.

So it uses iterator arithmetic. so we need to inspect the information about iterators. Here it says:

3 A basic_string is a contiguous container ([container.requirements.general]).

So we need to go here:

A contiguous container is a container that supports random access iterators ([random.access.iterators]) and whose member types iterator and const_iterator are contiguous iterators ([iterator.requirements.general]).

Then here:

Iterators that further satisfy the requirement that, for integral values n and dereferenceable iterator values a and (a + n), *(a + n) is equivalent to *(addressof(*a) + n), are called contiguous iterators.

Apparently, contiguous iterators are a C++17 feature which was added in these papers.

The requirement can be rewritten as:

assert(*(a + n) == *(&*a + n));

So, in the second part we dereference iterator, then take address of the value it points to, then do a pointer arithmetic on it, dereference it and it's the same as incrementing an iterator and then dereferencing it. This means that contiguous iterator points to the memory where each value stored right after the other, hence contiguous. Since functions that take char* expect contiguous memory, you can pass the result of &str.front() or &str[0] to these functions.




回答2:


You can simply use &s[0] for a non-empty string. This gives you a pointer to the start of the buffer

When you use it to put a string of n characters there the string's length (not just the capacity) needs to be at least n beforehand, because there's no way to adjust it up without clobbering the data.

I.e., usage can go like this:

auto foo( int const n )
    -> string
{
    if( n <= 0 ) { return ""; }

    string result( n, '#' );   // # is an arbitrary fill character.
    int const n_stored = some_api_function( &result[0], n );
    assert( n_stored <= n );
    result.resize( n_stored );
    return result;
}

This approach has worked formally since C++11. Before that, in C++98 and C++03, the buffer was not formally guaranteed to be contiguous. However, for the in-practice the approach has worked since C++98, the first standard – the reason that the contiguous buffer requirement could be adopted in C++11 (it was added in the Lillehammer meeting, I think that was 2005) was that there were no extant standard library implementations with a non-contiguous string buffer.


Regarding

C++17 added added non-const data() to std::string but it still says that you can't modify the buffer.

I'm not aware of any such wording, and since that would defeat the purpose of non-const data() I doubt that this statement is correct.


Regarding

Now I plan to work with big amount of information and copying this buffer will have a noticeable impact and I want to avoid it.

If copying the buffer has a noticeable impact, then you'd want to avoid inadvertently copying the std::string.

One way is to wrap it in a class that's not copyable.




回答3:


I don't know what you intend to do with that string, but if
all you need is a buffer of chars which frees its own memory automatically,
then I usually use vector<char> or vector<int> or whatever type
of buffer you need.

With v being the vector, it's guaranteed that &v[0] points to
a sequential memory which you can use as a buffer.



来源:https://stackoverflow.com/questions/39200665/directly-write-into-char-buffer-of-stdstring

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!