I've been using std::regex_iterator
to parse log files. My program has been working quite nicely for some weeks and has parsed millions of log lines, until to
The regex appears to be OK; at least there is nothing in it that could cause catastrophic backtracking.
I see a small possibility to optimize the regex, cutting down on stack use:
static wregex rgx_log_lines(
    L"^L(\\d+)\\s+"       // Level
    L"T(\\d+)\\s+"        // TID
    L"(\\d+)\\s+"         // Timestamp
    L"\\[([\\w:]+)\\]"    // Function name
    L"((?:"               // Message body: repeat one character at a time...
    L"(?!"                // ...but stop matching when the lookahead sees...
    L"^L\\d"              // ...a new log statement at the beginning of a line
    L")"
    L"[^]"                // Match any character until then
    L")*)"                // End of the repetition and of the message capture
);
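One way to cut stack use further is to avoid the per-character lookahead loop altogether. The following is only a sketch of that alternative, not the original approach: it reads the buffer line by line, matches just the header fields at the start of each line, and treats every non-header line as a continuation of the previous message. The names LogEntry and parse_log are hypothetical, and unlike the original pattern it requires a full header (not just L followed by a digit) to start a new entry.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the field names are illustrative only.
struct LogEntry {
    std::wstring level, tid, timestamp, function, message;
};

// Parses a log buffer line by line. A line that begins with a full header
// starts a new entry; any other line is appended to the current message.
std::vector<LogEntry> parse_log(const std::wstring& text) {
    // Header fields only; the message is taken from the match suffix,
    // so the regex never has to walk the message body.
    static const std::wregex header(
        L"L(\\d+)\\s+"        // Level
        L"T(\\d+)\\s+"        // TID
        L"(\\d+)\\s+"         // Timestamp
        L"\\[([\\w:]+)\\]",   // Function name
        std::regex_constants::ECMAScript);

    std::vector<LogEntry> entries;
    std::wistringstream in(text);
    std::wstring line;
    while (std::getline(in, line)) {
        std::wsmatch m;
        // match_continuous anchors the search at the start of the line.
        if (std::regex_search(line, m, header,
                              std::regex_constants::match_continuous)) {
            LogEntry e;
            e.level = m[1].str();
            e.tid = m[2].str();
            e.timestamp = m[3].str();
            e.function = m[4].str();
            e.message = m.suffix().str();
            entries.push_back(e);
        } else if (!entries.empty()) {
            // Continuation line: keep it as part of the previous message.
            entries.back().message += L'\n';
            entries.back().message += line;
        }
    }
    return entries;
}
```

Because the regex only ever sees one line and stops after the function name, the matcher's recursion depth no longer grows with the length of a multi-line message.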
Did you set the ECMAScript option? Otherwise, I suspect the regex library defaults to POSIX regexes, and those don't support lookahead assertions.
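For what it's worth, the grammar can be passed explicitly as the second constructor argument. In the C++ standard, ECMAScript is already the default grammar for std::basic_regex, so spelling it out mainly rules this question out. A minimal sketch follows; the helper name is_continuation is hypothetical, and the pattern is a reduced version of the lookahead from the question.

```cpp
#include <regex>
#include <string>

// Returns true if `line` does NOT begin a new log statement ("L" + digit).
// The negative lookahead (?!...) is part of the ECMAScript grammar.
bool is_continuation(const std::wstring& line) {
    static const std::wregex not_header(
        L"(?!L\\d).*",
        std::regex_constants::ECMAScript);  // explicit, same as the default
    return std::regex_match(line, not_header);
}
```

If the lookahead compiles and behaves like this, the grammar selection is not the problem.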