Memory is corrupted after pushing into the vector

ⅰ亾dé卋堺 submitted on 2019-12-23 02:18:10

Question


Why is the memory corrupted after pushing into the vector? In the program below I have a struct with a string member (it is not a pointer). Each time, I create a local struct object, assign a string value, and push it into the vector. After pushing, I modify the local struct object, but the change shows up in the string data of the struct object stored in the vector.

    #include <iostream>
    #include <vector>
    #include <string>
    #include <cstring>   // memcpy (standard header; <memory.h> is non-standard)

    using namespace std;

    void PushVector(string);

    struct thread_info
    {
            int  id;
            string threadname;
            bool bval;
    };

    std::vector<thread_info> myvector;


    int main ()
    {
            PushVector("Thread1"); // valid data into vector
            PushVector("Thread2");

            struct thread_info print;

            while(!myvector.empty())
            {
                    for(unsigned int index = 0; index < myvector.size(); ++index )
                    {
                            print = myvector.at(index);
                            cout<<"id : "<<print.id<<"\nthread name : "<<print.threadname<<"\nbool value : "<<print.bval<<endl;
                    }
                    myvector.clear();
            }
            return 0;
    }

    void PushVector(const string str)
    {

            std::cout << "Push the thread name to vector\n";
            struct thread_info thread;
            thread.id = 10;
            thread.threadname = str;
            thread.bval = true;
            myvector.push_back (thread); //copying struct obj to vector
            char* p =  (char* )thread.threadname.c_str();
            memcpy(p,"Wrong", 5); //==> Memory corrupting with invalid data after push back. Is it a limitation in C++? 
            thread.threadname = "blabla";  //trying to corrupt directly to string object
    }

Output:

    Push the thread name to vector
    Push the thread name to vector
    id : 10
    thread name : Wrongd1 ==> Memory corrupted? Why no blabla string?
    bool value : 1
    id : 10
    thread name : Wrongd2 ==> Memory corrupted? Why no blabla string?
    bool value : 1


Answer 1:


tl;dr

Your memcpy() on a const pointer (which is undefined behaviour) ran afoul of copy-on-write optimization.


Yes, vector::push_back() pushes a copy of the object into the vector. So after you push_back()ed your local thread_info object, changes to the local object should not affect the object in the vector, right?
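To make the expected copy semantics concrete, here is a minimal sketch. It assumes a cut-down `thread_info` like the one in the question; the helper name `push_and_modify` is hypothetical and only for illustration. It changes the local object through `std::string`'s own interface, which is the well-defined case:

```cpp
#include <string>
#include <vector>

struct thread_info {
    int id;
    std::string threadname;
    bool bval;
};

// Hypothetical helper: pushes a copy of a local object, then modifies the
// local through std::string's assignment operator. Returns the name that
// the vector holds after the local object was changed.
std::string push_and_modify(std::vector<thread_info>& v, const std::string& name) {
    thread_info local{10, name, true};
    v.push_back(local);            // the vector stores its own copy
    local.threadname = "blabla";   // well-defined write: affects only `local`
    return v.back().threadname;
}
```

Called with an empty vector and `"Thread1"`, the function returns `"Thread1"`: the assignment to `local` never reaches the element in the vector.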

However, std::string is allowed to assume that any access to it will happen in a well-defined way. Doing a memcpy() to the (const) pointer returned by .c_str() is not well-defined.

So... let's say that std::string took a shortcut when copying the thread_info object into the vector: Instead of copying the contained data, it copied the pointer to the data, so that two std::string objects reference the same memory area.

It can defer the copying to when (and if) it actually becomes necessary, i.e. when one of the strings is written to through any of the defined functions (like string::insert() or operator+=). This is called "copy-on-write", a rather common optimization.

By casting away the const from the return value of .c_str() and running a memcpy() on it, you foiled this mechanic. Since you did not go through any of the string member functions that could have done the copy-on-write, the two objects -- which should be different -- are still pointing to the same data memory.
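The mechanism can be sketched with a toy class. This is an illustration only, not how any real `std::string` is implemented; `CowString`, `set()` and `str()` are made-up names. Copies share one buffer, and only the defined write path (`set()`) takes a private copy first:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy copy-on-write string (hypothetical, for illustration only).
class CowString {
public:
    explicit CowString(const std::string& s)
        : buf_(std::make_shared<std::vector<char>>(s.begin(), s.end())) {}

    // The implicit copy constructor copies only the shared_ptr,
    // so a copy shares the character buffer with its source.

    // Read-only view of the characters; may be shared with other objects.
    const char* data() const { return buf_->data(); }

    std::string str() const { return std::string(buf_->begin(), buf_->end()); }

    // Defined write path: detach (deep-copy) first if the buffer is shared.
    void set(std::size_t i, char c) {
        if (buf_.use_count() > 1) {
            buf_ = std::make_shared<std::vector<char>>(*buf_);
        }
        buf_->at(i) = c;
    }

private:
    std::shared_ptr<std::vector<char>> buf_;
};
```

With `CowString a("Thread1"); CowString b = a;`, both `data()` pointers are equal. Writing through `const_cast<char*>(a.data())` skips the detach step in `set()`, so the change is visible through `b` as well; calling `b.set(0, 'T')` afterwards gives `b` its own buffer and leaves `a` alone. In the toy the raw write is merely a design violation; with a real `std::string` it is undefined behaviour.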

GDB output, with breakpoint at the last line of PushVector():

    (gdb) print &thread
    $3 = (thread_info *) 0x7fffffffe240
    (gdb) print &myvector[0]
    $4 = (thread_info *) 0x605040

The two thread_info objects are different.

    (gdb) print &thread.threadname
    $5 = (std::string *) 0x7fffffffe248
    (gdb) print &myvector[0].threadname
    $6 = (std::string *) 0x605048

The two string objects are different as well.

    (gdb) print thread.threadname.c_str()
    $7 = 0x605028 "Wrongd1"
    (gdb) print myvector[0].threadname.c_str()
    $8 = 0x605028 "Wrongd1"

But they point to the same memory area, since neither string object is aware that there has been a write access, so no actual copying of the data has taken place.
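The same check can be done without a debugger. The sketch below uses a hypothetical helper, `shares_buffer`, that simply compares the `c_str()` pointers of two strings. Whether a string and its copy share a buffer depends on the standard library: since C++11 the requirements on `std::string` effectively rule out copy-on-write, so on a conforming modern implementation a copy should report `false`, while an older copy-on-write library can report `true` as in the GDB session above.

```cpp
#include <string>

// Hypothetical helper: true if both strings currently expose the same
// character buffer through c_str(). Only the pointers are compared;
// nothing is written through them.
bool shares_buffer(const std::string& a, const std::string& b) {
    return a.c_str() == b.c_str();
}
```

For example, `shares_buffer(thread.threadname, myvector[0].threadname)` placed right after the `push_back()` in the question's code would tell you whether your toolchain shares the data.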




Answer 2:


memcpying to the result of .c_str() is wrong. Was the fact that you had to hack away the const with a cast not a hint? Which learning resource taught you to do that?

As paulm so aptly said:

Stomping over the string's memory like that will only result in tears.

std::string::c_str() returns a pointer to a constant buffer that you shall not modify; due to certain optimisations present in some toolchains (e.g. copy-on-write strings in GCC's libstdc++ before 5.0) that buffer may even be shared with other string objects, which appears to be the case for you here.

Forget memcpy; this is not C.

At best, you could do this:

    thread.threadname.resize(5);
    memcpy(&thread.threadname[0], "Wrong", 5);

Or, in C++ code:

    thread.threadname.resize(5);
    const char src[] = "Wrong";  // one named array, so both ends of the range refer to the same object
    std::copy(src, src + 5, thread.threadname.begin());

But, for reals, you should write:

    thread.threadname = "Wrong";
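The three options can be put side by side in a small sketch. The function names and the `kWrong` constant are made up for illustration. Each variant writes only through the string's non-const interface (`resize()`, `operator[]`, `begin()`, assignment), so the string knows it is being modified and a copy made earlier should keep its own contents:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical constant used as the source for all three variants.
const char kWrong[] = "Wrong";
const std::size_t kWrongLen = sizeof(kWrong) - 1;  // length without the '\0'

// Variant 1: memcpy into the string's own writable storage.
void set_with_memcpy(std::string& s) {
    s.resize(kWrongLen);
    std::memcpy(&s[0], kWrong, kWrongLen);
}

// Variant 2: std::copy through the string's iterators.
void set_with_copy(std::string& s) {
    s.resize(kWrongLen);
    std::copy(kWrong, kWrong + kWrongLen, s.begin());
}

// Variant 3: plain assignment, the simplest choice.
void set_with_assign(std::string& s) {
    s = kWrong;
}
```

Applied to a copy of a string holding `"Thread1"`, each variant leaves the copy as `"Wrong"` and the original as `"Thread1"`.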


Source: https://stackoverflow.com/questions/32988504/memory-is-corrupted-after-pushing-into-the-vector
