Question
I have a long string and I need to convert the numbers in it to words (e.g. 5 to five). Can I do this with a regex? I tried using regex_replace, but this changed all the numbers to the one that was found first (e.g. it converted "5 10 1 0" to "five five five five", but I need "five ten one zero").
This was my attempt:
string text ="a lot of text";
regex pattern("(\\d)+");
smatch result;
int x; string buffer;
while (regex_search(text, result, pattern))
{
buffer = result[0];
x = atoi(buffer.c_str());
switch (x)
{
case 0: text = regex_replace(text, pattern, numbers[0]); break;
case 1: text = regex_replace(text, pattern, numbers[1]); break;
case 2: text = regex_replace(text, pattern, numbers[2]); break;
case 3: text = regex_replace(text, pattern, numbers[3]); break;
case 4: text = regex_replace(text, pattern, numbers[4]); break;
case 5: text = regex_replace(text, pattern, numbers[5]); break;
case 6: text = regex_replace(text, pattern, numbers[6]); break;
case 7: text = regex_replace(text, pattern, numbers[7]); break;
case 8: text = regex_replace(text, pattern, numbers[8]); break;
case 9: text = regex_replace(text, pattern, numbers[9]); break;
case 10: text = regex_replace(text, pattern, numbers[10]); break;
}
text = result.suffix().str();
}
Answer 1:
You are getting "five five five five" because 5 is the first match found by regex_search, and regex_replace then replaces every match of the pattern (\d)+, that is, every number in the string, with "five".
Instead, you could run a separate regex_replace for each number you want to replace, using a pattern that matches only that number:
#include <array>
#include <iostream>
#include <string>
#include <regex>
using namespace std;
int main(int, char**) {
// Deducing the array type like this requires C++17 (class template argument deduction)
auto numbers = array{"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten"};
string text = "6 18 2 3 4 5 2 0 0 1 4 10 19 9 1nin1ja xd3 10";
for (size_t i = 0; i < numbers.size(); ++i) {
regex pattern("\\b" + std::to_string(i) + "\\b");
text = regex_replace(text, pattern, numbers[i]);
}
cout << text << endl; //six 18 two three four five two zero zero one four ten 19 nine 1nin1ja xd3 ten
return 0;
}
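The behaviour described at the top of this answer can be reproduced in isolation. The following is a minimal sketch, not part of the original answer; the helper names `replace_all_matches` and `replace_first_match` are made up for illustration. It shows that `std::regex_replace` substitutes every match by default, and that passing `std::regex_constants::format_first_only` limits it to the first one.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces EVERY run of digits with the same word,
// which is the behaviour seen in the question.
std::string replace_all_matches(const std::string& text, const std::string& word) {
    static const std::regex pattern("\\d+");
    return std::regex_replace(text, pattern, word);
}

// Hypothetical helper: replaces only the FIRST run of digits.
std::string replace_first_match(const std::string& text, const std::string& word) {
    static const std::regex pattern("\\d+");
    return std::regex_replace(text, pattern, word,
                              std::regex_constants::format_first_only);
}
```

Calling `replace_all_matches("5 10 1 0", "five")` gives "five five five five", while `replace_first_match` with the same arguments gives "five 10 1 0". Neither picks a different word per number, which is why the answer above uses one narrowly targeted pattern per number instead.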
Answer 2:
std::regex_replace replaces all occurrences of the regular expression, so on the first invocation it will replace every number with the word for the first match.
You need to instead iterate over the matches and append the right replacement to the output.
Something like this:
#include <iostream>
#include <string>
#include <regex>
int main() {
std::string text = "before 5 10 1 11 after";
std::string numbers[] = { "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten" };
std::regex pattern("\\d+");
std::string result;
std::smatch match;
auto begin = text.cbegin();
while (std::regex_search(begin, text.cend(), match, pattern)) {
result += match.prefix(); // copy the substring before the match
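const std::string token = match.str();
// Only one- or two-digit tokens can be 0-10; this also keeps std::stoi from throwing on very long digit runs
int x = (token.size() <= 2) ? std::stoi(token) : -1;
if (x >= 0 && x <= 10) {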
result += numbers[x];
} else {
result += match[0]; // a number but out-of-range - copy it as-is
}
begin += match.position() + match.length();
}
// Copy whatever follows the last match. The contents of match are unspecified
// after a failed search, so use begin rather than match.suffix().
// This also covers the case where nothing matched at all.
result.append(begin, text.cend());
std::cout << result << std::endl;
}
Unlike the approaches that run one regex_replace per number, this one scans the input only once, so it should be noticeably faster on long strings.
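The same single-pass idea can also be written with `std::sregex_iterator`, which walks over the matches without manual iterator arithmetic. This is only a sketch under a few assumptions: the function name `digits_to_words` is hypothetical, only the numbers 0 to 10 are translated, and digit runs longer than two characters are copied unchanged so that `std::stoi` is never handed a value that could overflow.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces the numbers 0-10 with English words in one pass.
std::string digits_to_words(const std::string& text) {
    static const char* const words[] = {"zero", "one", "two", "three", "four", "five",
                                        "six", "seven", "eight", "nine", "ten"};
    static const std::regex pattern("\\d+");

    std::string result;
    auto last = text.cbegin();  // end of the previous match
    for (std::sregex_iterator it(text.cbegin(), text.cend(), pattern), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        result.append(last, m[0].first);  // text between the previous match and this one
        const std::string token = m.str();
        if (token.size() <= 2 && std::stoi(token) <= 10) {
            result += words[std::stoi(token)];
        } else {
            result += token;  // out of range: keep the number as-is
        }
        last = m[0].second;
    }
    result.append(last, text.cend());  // text after the last match
    return result;
}
```

With the input from this answer, `digits_to_words("before 5 10 1 11 after")` yields "before five ten one 11 after". Note that, like the loop above, the pattern `\d+` has no word boundaries, so digits embedded in words (for example "xd3") are translated too.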
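Answer 3:
The following code will do what you are asking. It isn't terribly efficient because it finds each number twice (once with std::regex_search, once with std::string::find), but it will replace the numbers 0 to 10 with the words zero to ten. Because std::string::find always searches from the start of text, it relies on every earlier number having already been replaced; a number outside 0 to 10 (such as 11) is left in place, and its digits may be picked up by a later find.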
#include <cstdlib>   // atoi
#include <iostream>
#include <regex>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> numbers {
"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "ten" };
std::string text = "These numbers: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 are text now";
std::string tmp_text = text;
std::regex pattern("(\\d)+");
std::smatch result;
int n;
std::string buffer;
std::size_t pos;
while (std::regex_search(tmp_text, result, pattern))
{
buffer = result[0];
n = atoi(buffer.c_str());
switch (n)
{
case 0:
pos = text.find('0');
text.replace(pos, 1, numbers[0]);
break;
case 1:
pos = text.find('1');
text.replace(pos, 1, numbers[1]);
break;
case 2:
pos = text.find('2');
text.replace(pos, 1, numbers[2]);
break;
case 3:
pos = text.find('3');
text.replace(pos, 1, numbers[3]);
break;
case 4:
pos = text.find('4');
text.replace(pos, 1, numbers[4]);
break;
case 5:
pos = text.find('5');
text.replace(pos, 1, numbers[5]);
break;
case 6:
pos = text.find('6');
text.replace(pos, 1, numbers[6]);
break;
case 7:
pos = text.find('7');
text.replace(pos, 1, numbers[7]);
break;
case 8:
pos = text.find('8');
text.replace(pos, 1, numbers[8]);
break;
case 9:
pos = text.find('9');
text.replace(pos, 1, numbers[9]);
break;
case 10:
pos = text.find("10");
text.replace(pos, 2, numbers[10]);
break;
}
tmp_text = result.suffix().str();
}
std::cout << text << std::endl;
}
// output:
// These numbers: zero, one, two, three, four, five, six, seven, eight, nine, ten are text now
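The double lookup mentioned at the start of this answer can be avoided by reusing the position that `std::regex_search` already reports. The following is a rough sketch rather than a drop-in replacement: the function name `replace_in_place` is hypothetical, and it assumes that only one- or two-digit tokens in the range 0 to 10 should be rewritten.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: rewrites the numbers 0-10 in place, using the match
// position instead of a second std::string::find.
std::string replace_in_place(std::string text) {
    static const std::vector<std::string> words{
        "zero", "one", "two", "three", "four", "five",
        "six", "seven", "eight", "nine", "ten"};
    static const std::regex pattern("\\d+");

    std::size_t offset = 0;  // index in text where the next search starts
    std::smatch match;
    while (std::regex_search(text.cbegin() + static_cast<std::ptrdiff_t>(offset),
                             text.cend(), match, pattern)) {
        // Copy what is needed first: modifying text invalidates match.
        const std::size_t pos = offset + static_cast<std::size_t>(match.position(0));
        const std::string token = match.str(0);

        if (token.size() <= 2 && std::stoi(token) <= 10) {
            const std::string& word = words[std::stoi(token)];
            text.replace(pos, token.size(), word);
            offset = pos + word.size();   // continue after the inserted word
        } else {
            offset = pos + token.size();  // skip numbers that are out of range
        }
    }
    return text;
}
```

Because the search resumes after each replacement, a number such as 11 is skipped as a whole, so `replace_in_place("11 1")` yields "11 one" instead of touching the first digit of 11.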
Source: https://stackoverflow.com/questions/58563492/replace-each-item-differently-regex-c