C++ Tokenize a string with spaces and quotes

Submitted by 醉酒当歌 on 2019-12-17 20:25:19

Question


I would like to write something in C++ that tokenizes a string. To explain what I want, take the following string:

add string "this is a string with spaces!"

This must be split as follows:

add
string
this is a string with spaces!

Is there a quick and standard-library-based approach?


Answer 1:


I don't think there is a straightforward approach in the standard library. Indirectly, the following algorithm will work:

a) Search for '\"' with string::find('\"'). If one is found, search for the next '\"' using string::find('\"', prevIndex + 1), where prevIndex is the position of the first quote. If that is found too, extract the text between the two quotes with string::substr() and discard that part from the original string.

b) Now search for the ' ' character in the same way.

NOTE: you have to iterate through the whole string.
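
The two find steps above can be arranged into a single left-to-right pass so that the tokens come out in their original order. The following is only a minimal sketch under some assumptions: the function name tokenize_with_find is hypothetical, only double quotes are handled, and an unterminated quote is assumed to run to the end of the input.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (the name is illustrative): splits `input` on spaces,
// treating text between double quotes as a single token.
std::vector<std::string> tokenize_with_find(const std::string& input) {
    std::vector<std::string> tokens;
    std::string::size_type pos = 0;
    while (pos < input.size()) {
        // Skip any run of spaces before the next token.
        pos = input.find_first_not_of(' ', pos);
        if (pos == std::string::npos) break;

        if (input[pos] == '"') {
            // Quoted token: look for the closing quote after the opening one.
            std::string::size_type close = input.find('"', pos + 1);
            if (close == std::string::npos) {
                // Assumption: an unterminated quote runs to the end of the input.
                tokens.push_back(input.substr(pos + 1));
                break;
            }
            tokens.push_back(input.substr(pos + 1, close - pos - 1));
            pos = close + 1;
        } else {
            // Unquoted token: runs up to the next space or the end of the input.
            std::string::size_type end = input.find(' ', pos);
            if (end == std::string::npos) {
                tokens.push_back(input.substr(pos));
                break;
            }
            tokens.push_back(input.substr(pos, end - pos));
            pos = end;
        }
    }
    return tokens;
}
```

Called on the string from the question, this sketch returns the three tokens add, string, and this is a string with spaces!. A quote that appears in the middle of an unquoted word is kept as an ordinary character.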




Answer 2:


No extra library is needed. A single loop over the characters can do the task (if it is as simple as you describe).

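#include <iostream>
#include <string>
using namespace std;

int main(){
    string str = "add string \"this is a string with space!\"";

    for( size_t i=0; i<str.length(); i++){
        char c = str[i];
        if( c == ' ' ){
            // A space outside quotes ends the current token.
            cout << endl;
        }else if(c == '\"' ){
            // Skip the opening quote, then print up to the closing quote.
            // The bounds check stops the loop if the quote is never closed.
            i++;
            while( i<str.length() && str[i] != '\"' ){ cout << str[i]; i++; }
        }else{
            cout << c;
        }
    }
    cout << endl;
}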

That outputs:

add
string
this is a string with space!
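
The same character-by-character pass can collect the tokens instead of printing them. What follows is a hedged sketch rather than a definitive implementation: the name tokenize_by_char and the two flags are illustrative, only double quotes are recognised, and an unclosed quote is assumed to extend to the end of the input.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (the name is illustrative): one pass over the input,
// toggling a flag whenever a double quote is seen.
std::vector<std::string> tokenize_by_char(const std::string& input) {
    std::vector<std::string> tokens;
    std::string current;
    bool in_quotes = false;  // true between an opening and a closing quote
    bool has_token = false;  // true once the current token has started

    for (char c : input) {
        if (c == '"') {
            in_quotes = !in_quotes;
            has_token = true;  // so that "" yields an empty token
        } else if (c == ' ' && !in_quotes) {
            // A space outside quotes ends the current token, if there is one.
            if (has_token) {
                tokens.push_back(current);
                current.clear();
                has_token = false;
            }
        } else {
            current += c;
            has_token = true;
        }
    }
    // Flush the last token; an unclosed quote ends up here as well.
    if (has_token) tokens.push_back(current);
    return tokens;
}
```

Because the loop never indexes past the current character, an unterminated quote cannot cause an out-of-bounds read. Runs of several spaces do not produce empty tokens, while an explicit "" does.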



Answer 3:


Here is a complete function for it. Modify it as needed; it adds the parts of the string to a vector of strings (qargs).

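#include <iostream>
#include <string>
#include <vector>

// Splits command on spaces; text inside double or single quotes becomes one argument.
void split_in_args(std::vector<std::string>& qargs, std::string command){
        int len = command.length();
        bool qot = false, sqot = false;
        int arglen;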
        for(int i = 0; i < len; i++) {
                int start = i;
                if(command[i] == '\"') {
                        qot = true;
                }
                else if(command[i] == '\'') sqot = true;

                if(qot) {
                        i++;
                        start++;
                        while(i<len && command[i] != '\"')
                                i++;
                        if(i<len)
                                qot = false;
                        arglen = i-start;
                        i++;
                }
                else if(sqot) {
                        i++;
                        start++; // skip the opening single quote, as in the double-quote branch
                        while(i<len && command[i] != '\'')
                                i++;
                        if(i<len)
                                sqot = false;
                        arglen = i-start;
                        i++;
                }
                else{
                        while(i<len && command[i]!=' ')
                                i++;
                        arglen = i-start;
                }
                qargs.push_back(command.substr(start, arglen));
        }
        // Print the arguments and their count for inspection.
        for(std::size_t i=0;i<qargs.size();i++){
                std::cout<<qargs[i]<<std::endl;
        }
        std::cout<<qargs.size()<<std::endl;
        if(qot || sqot) std::cout<<"One of the quotes is open\n";
}



Answer 4:


I wonder why this simple, C++-style solution has not been presented here. It is based on the fact that if we first split the string by \", then each even chunk (counting from 1) is "inside" quotes, and each odd chunk should additionally be split by spaces.

There is no possibility of out_of_range or similar errors, because all the splitting is done by std::getline.

// Requires <iostream>, <sstream> and <string>; input is the std::string to tokenize.
unsigned counter = 0;
std::string segment;
std::stringstream stream_input(input);
while(std::getline(stream_input, segment, '\"'))
{
    ++counter;
    if (counter % 2 == 0)
    {
        if (!segment.empty())
            std::cout << segment << std::endl;
    }
    else
    {
        std::stringstream stream_segment(segment);
        while(std::getline(stream_segment, segment, ' '))
            if (!segment.empty())
                std::cout << segment << std::endl;
    }
}
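
For reuse, the same getline-based idea can be wrapped in a function that returns the tokens. This is only a sketch under stated assumptions: the name tokenize_with_getline is hypothetical, a boolean flag replaces the counter, and empty chunks are dropped as in the snippet above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical wrapper (the name is illustrative) around the two-level
// std::getline split: first by double quote, then by space.
std::vector<std::string> tokenize_with_getline(const std::string& input) {
    std::vector<std::string> tokens;
    std::istringstream stream_input(input);
    std::string segment;
    bool inside_quotes = false;  // the first chunk precedes any quote

    while (std::getline(stream_input, segment, '"')) {
        if (inside_quotes) {
            // A chunk between two quotes is kept whole.
            if (!segment.empty()) tokens.push_back(segment);
        } else {
            // A chunk outside quotes is split further on spaces.
            std::istringstream stream_segment(segment);
            std::string word;
            while (std::getline(stream_segment, word, ' '))
                if (!word.empty()) tokens.push_back(word);
        }
        inside_quotes = !inside_quotes;
    }
    return tokens;
}
```

One consequence of this design is that a quote always starts a new chunk, so text such as a"b c"d comes out as three tokens, and an empty quoted string "" is dropped rather than returned as an empty token.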


Source: https://stackoverflow.com/questions/18675364/c-tokenize-a-string-with-spaces-and-quotes
