Why is it unsafe to call push_back in a for loop even when there is no memory reallocation?

问题

When I read this post: https://stackoverflow.com/a/42448319/3336423

I understand, that calling push_back in a for loop is unsafe because:

If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.

Then, I assume that if I guarantee the capacity won't be exceeded, it would be safe....

But apparently not, all implementations below will crash after first push_back is called (using Visual Studio), even if I can see that capacity remains unchanged within the loop (so I assume the vector does not reallocate its memory):

Version1:

std::vector<int> v1{ 3, 4, 5 };
v1.reserve( v1.size()*2 );
size_t c1 = v1.capacity();
for ( auto val : v1 )
{
    v1.push_back( val );
    c1 = v1.capacity();
}

Version2:

std::vector<int> v2{ 3, 4, 5 };
v2.reserve( v2.size()*2 );
size_t c2 = v2.capacity();
auto curEnd = v2.end();
for ( auto iter = v2.begin(); iter != curEnd; ++iter )
{
    v2.push_back( *iter );
    c2 = v2.capacity();
}

Version3:

std::vector<int> v3{ 3, 4, 5 };
v3.reserve( v3.size()*2 );
size_t c3 = v3.capacity();
for ( auto iter = v3.begin(); iter != v3.end(); ++iter )
{
    v3.push_back( *iter );
    c3 = v3.capacity();
}

What makes those code crash?

回答1:

Both your first two versions have the same problem. The cached past-the-end iterator is invalidated after the first insertion, making any subsequent use of it UB. Yes, even just comparing against it.

Your third sample crashes because it ends up trying to re-insert the freshly inserted elements. So it ends up needing a reallocation, which ends up causing more UB.

The correct iterator-based approach is to not go until one past the end. But until the end, assuming the vector is not empty.

std::vector<int> v{ 3, 4, 5 };
v.reserve( v.size()*2 );

auto it = v.begin(), end = v.end();
--end;
do {
  v.push_back(*it);
} while (it++ != end); // Check against current position, but increment still

回答2:

Same quote with my emphasis:

If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.

curEnd of Version2 is past-the-end iterator and is therefore invalidated. Version1 does something similar under the syntactic sugar.

Version3 will continue iterating the newly inserted elements until reallocation happens. Even if there was no reallocation, it would loop indefinitely until all memory is exhausted.

A simple solution to modify Version3 by storing the old size (count), and loop until begin + count

std::vector<int> v3{ 3, 4, 5 };
auto count = v3.size();
v3.reserve( v3.size()*2 );
size_t c3 = v3.capacity();
for ( auto iter = v3.begin(); iter != v3.begin() + count; ++iter )
{
    v3.push_back( *iter );
    c3 = v3.capacity();
}

Another is to use the old fashioned indexed loop.

回答3:

If the new size() is greater than capacity() then all iterators and references (including the past-the-end iterator) are invalidated. Otherwise only the past-the-end iterator is invalidated.

In both cases, the end iterator is stored in a variable only once, and so as soon as you add an element, that iterator is invalid and using it is undefined behavior.

You need to get the new end iterator in each loop iteration, as you did in Version 3. But that one crashes too!

For the exact same reason. Your loop is actually infinite and will never stop allocating memory for objects. Because the capacity is not infinite, you will get a reallocation and then you have undefined behavior.

You are always adding one element to the vector, and since the loop stops when it went through each element, it never stops. It's a bit like

int end = 1;
for (int i = 0; i < end; ++i) // infinite!
  ++end;

You can always use an index based loop instead if you want to add the three elements again. Or more even, given that you reserve enough storage. :)

来源：https://stackoverflow.com/questions/48518970/why-is-it-unsafe-to-call-push-back-in-a-for-loop-even-when-there-is-no-memory-re

标签

c++

for-loop

push-back