Expression: string iterator not dereferencable while using boost regex

萝らか妹 posted on 2019-12-11 02:59:00

Question


I want to extract all the links from a page. When I run this code, I get:

Microsoft Visual C++ Debug Library

Debug Assertion Failed!

Program: C:\Users\Gandalf\Desktop\proxy\Debug\Proxy.exe File: C:\Program Files\Microsoft Visual Studio 10.0\VC\include\xstring Line: 78

Expression: string iterator not dereferencable

For information on how your program can cause an assertion failure, see the Visual C++ documentation on asserts.

(Press Retry to debug the application)

Abort Retry Ignore

void Deltacore::Client::get_links() {
    boost::smatch matches;
    boost::match_flag_type flags = boost::match_default;
    boost::regex URL_REGEX("^<a[^>]*(http://[^\"]*)[^>]*>([ 0-9a-zA-Z]+)</a>$");

    if(!response.empty()) {

        std::string::const_iterator alfa = this->response.begin();
        std::string::const_iterator omega = this->response.end();

        while (boost::regex_search(alfa, omega, matches, URL_REGEX))
        {
            std::cout << matches[0];
            //if(std::find(this->Links.begin(), this->Links.end(), matches[0]) != this->Links.end()) {
                this->Links.push_back(matches[0]);
            //}
            alfa = matches[0].second;
        }
    }
}

Any ideas?

Added more code:

    Deltacore::Client client;
    client.get_url(target);
    client.get_links();

    boost::property_tree::ptree props;
    for(size_t i = 0; i < client.Links.size(); i++)
        props.push_back(std::make_pair(boost::lexical_cast<std::string>(i), client.Links.at(i)));

    std::stringstream ss;
    boost::property_tree::write_json(ss, props, false);

    boost::asio::async_write(socket_,
        boost::asio::buffer(ss.str(), ss.str().length()),
        boost::bind(&session::handle_write, this,
            boost::asio::placeholders::error));

Thanks in advance


Answer 1:


The problem is on this line:

boost::asio::buffer(ss.str(), ss.str().length())

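str() returns a temporary std::string object, and boost::asio::buffer only refers to that string's memory without copying it. The temporary is destroyed at the end of the statement, while the asynchronous write runs later, so you are actually invalidating the buffer as soon as you create it – vanilla UB, as I commented. ;-]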

Token documentation citation:

The buffer is invalidated by any non-const operation called on the given string object.

Of course, destroying the string qualifies as a non-const operation.
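One common way around this is to give the payload an owner that lives until the completion handler has run, for example a std::shared_ptr<std::string> captured by the handler (a member variable of the session would also work). The sketch below models that idea without Boost so it stays self-contained: BufferView, FakeIoQueue and queue_write are hypothetical stand-ins invented for illustration, not Asio types, and the real async_write signature is the one shown in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an Asio buffer: a non-owning pointer/length pair.
struct BufferView {
    const char* data;
    std::size_t size;
};

// Hypothetical stand-in for the I/O machinery: operations are only queued
// by async_write and are performed later, when run() is called.
class FakeIoQueue {
public:
    void async_write(BufferView view,
                     std::function<void(const std::string&)> handler) {
        pending_.push_back([view, handler] {
            // The bytes are read here, long after async_write returned.
            handler(std::string(view.data, view.size));
        });
    }

    void run() {
        for (auto& op : pending_) op();
        pending_.clear();
    }

private:
    std::vector<std::function<void()>> pending_;
};

// Queues a write whose payload is owned by a shared_ptr. The handler holds
// a copy of that shared_ptr, so the string outlives the queued operation.
// `written` receives the bytes that were "sent" and must outlive run().
void queue_write(FakeIoQueue& io, const std::string& json, std::string& written) {
    auto payload = std::make_shared<std::string>(json);
    io.async_write(BufferView{payload->data(), payload->size()},
                   [payload, &written](const std::string& sent) {
                       written = sent;
                   });
}
```

Had queue_write built the view from a temporary such as ss.str(), the pointer inside BufferView would dangle by the time run() reads it, which is the situation in the question.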




Answer 2:


Skipping the lecture on using regex to parse HTML (and how you really shouldn't...), your regex doesn't look like it will work like you intend. This is yours:

"^<a[^>]*(http://[^\"]*)[^>]*>([ 0-9a-zA-Z]+)</a>$"

The first character class is greedy: it first consumes as much of the tag as it can, including your http and the parts that follow, and only reaches the URL by backtracking. Add a question mark to make it non-greedy.

"^<a[^>]*?(http://[^\"]*)[^>]*>([ 0-9a-zA-Z]+)</a>$"

This might or might not be related to the exception.
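To see what the question mark changes, here is a small sketch that runs both patterns over a tag containing two URLs. It assumes std::regex in place of boost::regex purely to avoid a dependency, and it drops the ^ and $ anchors so the test string does not have to be a whole line; first_url, kGreedy and kLazy are names made up for the illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// The question's pattern without the anchors (greedy first character class).
const char* const kGreedy = "<a[^>]*(http://[^\"]*)[^>]*>([ 0-9a-zA-Z]+)</a>";
// The same pattern with the suggested question mark (non-greedy).
const char* const kLazy = "<a[^>]*?(http://[^\"]*)[^>]*>([ 0-9a-zA-Z]+)</a>";

// Returns the first capture group of `pattern` in `text`,
// or an empty string when there is no match.
std::string first_url(const std::string& text, const std::string& pattern) {
    std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re)) {
        return std::string();
    }
    return m[1].str();
}
```

For a tag such as <a href="http://a.example/" title="see http://b.example/">Link</a>, the greedy version backtracks from the end of the tag and captures the last URL, while the non-greedy version captures the first one. With a single URL per tag both patterns capture the same text.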



Source: https://stackoverflow.com/questions/11678537/expression-string-iterator-not-dereferencable-while-using-boost-regex
