Question
I would like to write something in C++ that tokenizes a string. To explain what I want, take the following string:
add string "this is a string with spaces!"
This must be split as follows:
add
string
this is a string with spaces!
Is there a quick and standard-library-based approach?
Answer 1:
I don't think there is a straightforward approach using the standard library alone. Indirectly, the following algorithm will work:
a) Search for '\"' with string::find('\"'). If one is found, search for the next '\"' using string::find('\"', prevIndex + 1). If that one is found too, extract the text between them with string::substr() and discard that part from the original string.
b) Now search for the ' ' character in the same way.
NOTE: you have to iterate through the whole string.
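A minimal sketch of this find/substr idea, in case it helps. It is only one possible reading of the steps above: it walks the string left to right (so the tokens keep their original order) instead of removing the quoted parts first. The function name tokenize_with_find is made up for illustration, and treating an unterminated quote as running to the end of the input is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): splits on spaces,
// keeping a double-quoted run together as one token.
std::vector<std::string> tokenize_with_find(const std::string& input) {
    std::vector<std::string> tokens;
    std::string::size_type pos = 0;
    // Skip any spaces before the next token; stop when none is left.
    while ((pos = input.find_first_not_of(' ', pos)) != std::string::npos) {
        if (input[pos] == '"') {
            // Quoted token: look for the closing quote after the opening one.
            std::string::size_type close = input.find('"', pos + 1);
            if (close == std::string::npos) {
                // Assumption: an unterminated quote runs to the end of the input.
                tokens.push_back(input.substr(pos + 1));
                break;
            }
            tokens.push_back(input.substr(pos + 1, close - pos - 1));
            pos = close + 1;
        } else {
            // Plain token: everything up to the next space (or the end).
            std::string::size_type end = input.find(' ', pos);
            tokens.push_back(input.substr(pos, end - pos));
            if (end == std::string::npos) break;
            pos = end;
        }
    }
    return tokens;
}
```

Calling tokenize_with_find on the string from the question should give the three tokens add, string and this is a string with spaces!. A quote is only recognized at the start of a token, so a"b c" would still be split at the space.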
Answer 2:
No library is needed. A simple loop can do the task (if it is as simple as you describe).
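#include <iostream>
#include <string>
using namespace std;

int main() {
    string str = "add string \"this is a string with space!\"";
    for (size_t i = 0; i < str.length(); i++) {
        char c = str[i];
        if (c == ' ') {
            cout << endl;   // a space outside quotes ends the current token
        } else if (c == '\"') {
            i++;            // skip the opening quote
            // print the quoted text as is; the bounds check covers a missing closing quote
            while (i < str.length() && str[i] != '\"') { cout << str[i]; i++; }
        } else {
            cout << c;
        }
    }
    cout << endl;
}
This outputs: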
add
string
this is a string with space!
Answer 3:
Here is a complete function for it. Modify it as needed; it appends each part of the string to a vector of strings (qargs).
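#include <iostream>
#include <string>
#include <vector>

void split_in_args(std::vector<std::string>& qargs, std::string command){
    int len = command.length();
    bool qot = false, sqot = false;   // true while a double / single quote is open
    int arglen;
    // Note: consecutive spaces produce empty arguments.
    for(int i = 0; i < len; i++) {
        int start = i;
        if(command[i] == '\"') {
            qot = true;
        }
        else if(command[i] == '\'') sqot = true;
        if(qot) {
            i++;
            start++;                  // skip the opening quote
            while(i<len && command[i] != '\"')
                i++;
            if(i<len)
                qot = false;          // closing quote found
            arglen = i-start;
            i++;                      // skip the closing quote
        }
        else if(sqot) {
            i++;
            start++;                  // skip the opening quote
            while(i<len && command[i] != '\'')
                i++;
            if(i<len)
                sqot = false;         // closing quote found
            arglen = i-start;
            i++;                      // skip the closing quote
        }
        else{
            // unquoted argument: runs up to the next space
            while(i<len && command[i]!=' ')
                i++;
            arglen = i-start;
        }
        qargs.push_back(command.substr(start, arglen));
    }
    for(std::size_t i=0;i<qargs.size();i++){
        std::cout<<qargs[i]<<std::endl;
    }
    std::cout<<qargs.size()<<std::endl;
    if(qot || sqot) std::cout<<"One of the quotes is open\n";
}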
Answer 4:
I wonder why this simple, C++-style solution has not been presented here.
It is based on the fact that if we first split the string on \", then every even-numbered chunk is "inside" quotes, and every odd-numbered chunk should additionally be split on spaces.
There is no possibility of out_of_range or anything similar.
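#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::string input = "add string \"this is a string with spaces!\"";
    unsigned counter = 0;
    std::string segment;
    std::stringstream stream_input(input);
    // Split on the quote character first: chunks 2, 4, ... were inside quotes.
    while(std::getline(stream_input, segment, '\"'))
    {
        ++counter;
        if (counter % 2 == 0)
        {
            // Quoted chunk: print it whole.
            if (!segment.empty())
                std::cout << segment << std::endl;
        }
        else
        {
            // Unquoted chunk: split it further on spaces.
            std::stringstream stream_segment(segment);
            while(std::getline(stream_segment, segment, ' '))
                if (!segment.empty())
                    std::cout << segment << std::endl;
        }
    }
}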
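One more standard-library option, not used in any of the answers above, is std::quoted from <iomanip> (available since C++14). The sketch below shows how it could be applied to the question; the function name tokenize_with_quoted is made up for illustration. When extracting, std::quoted reads a double-quoted run as a single token if the token starts with a quote, and otherwise behaves like a plain operator>> into a string.

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): whitespace-separated
// tokens, with a leading double quote starting a quoted token.
std::vector<std::string> tokenize_with_quoted(const std::string& input) {
    std::vector<std::string> tokens;
    std::istringstream stream(input);
    std::string token;
    // Each extraction yields either one plain word or one quoted run
    // with the surrounding quotes removed.
    while (stream >> std::quoted(token)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Two things to keep in mind with this approach: by default std::quoted treats the backslash as an escape character inside quotes, so backslashes in the input are consumed, and a quote is only recognized at the start of a token.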
Source: https://stackoverflow.com/questions/18675364/c-tokenize-a-string-with-spaces-and-quotes