Why does std::regex_iterator cause a stack overflow with this data?

猫巷女王i 2021-01-18 05:42

I've been using std::regex_iterator to parse log files. My program has been working quite nicely for some weeks and has parsed millions of log lines, until to

2 answers
  •  轮回少年
    2021-01-18 06:18

    The regex appears to be OK; at least there is nothing in it that could cause catastrophic backtracking.

    I see a small possibility to optimize the regex, cutting down on stack use:

    static wregex rgx_log_lines(
        L"^L(\\d+)\\s+"             // Level
        L"T(\\d+)\\s+"              // TID
        L"(\\d+)\\s+"               // Timestamp
        L"\\[([\\w:]+)\\]"          // Function name
        L"((?:"                     // Complex pattern
          L"(?!"                    // Stop matching when...
            L"^L\\d"                // New log statement at the beginning of a line
          L")"
          L"[^]"                    // Match any character (newlines included) until then
        L")*)"                      // Repeat, capturing the whole message body
        );
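
If the overflow comes from the matcher recursing once per character of a long multi-line message, one way to sidestep it is to keep the regex on single lines and stitch continuation lines together by hand. This is not what the pattern above does; it is a minimal sketch of an alternative, and the names `LogRecord` and `parse_log` are hypothetical. It assumes the header fields always sit on the first line of a log statement.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the field names are illustrative only.
struct LogRecord {
    std::wstring level, tid, timestamp, function, message;
};

// Splits the input into records line by line. The regex only ever sees a
// single line, so its work is bounded by the line length rather than by the
// length of a whole multi-line message.
inline std::vector<LogRecord> parse_log(const std::wstring& text) {
    static const std::wregex header(
        L"^L(\\d+)\\s+"        // Level
        L"T(\\d+)\\s+"         // TID
        L"(\\d+)\\s+"          // Timestamp
        L"\\[([\\w:]+)\\]"     // Function name
        L"(.*)$",              // Rest of the first line
        std::regex_constants::ECMAScript);

    std::vector<LogRecord> records;
    std::wistringstream in(text);
    std::wstring line;
    std::wsmatch m;
    while (std::getline(in, line)) {
        if (std::regex_match(line, m, header)) {
            // A header line starts a new record.
            records.push_back({m[1].str(), m[2].str(), m[3].str(),
                               m[4].str(), m[5].str()});
        } else if (!records.empty()) {
            // Anything else is a continuation of the previous record.
            records.back().message += L"\n" + line;
        }
        // Lines before the first header are ignored in this sketch.
    }
    return records;
}
```

Calling `parse_log` on the whole file contents returns one `LogRecord` per log statement. A very long single line could still stress a recursive matcher, so this reduces the risk rather than removing it.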
    

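    Did you set the ECMAScript option explicitly? std::regex uses the ECMAScript grammar when no flag is passed, but if one of the POSIX grammars (basic, extended, awk, grep, egrep) is selected instead, lookahead assertions such as (?!...) are not supported.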
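
To rule the grammar out as a cause, the flag can be spelled out when the regex is constructed. As far as the standard specifies, `std::regex_constants::ECMAScript` is already what the constructor uses when no flag is given, so this mainly guards against another grammar being selected by accident. The sketch below is an illustration only: `make_ecmascript_regex` and `is_continuation_line` are hypothetical helpers, and the lookahead is a cut-down version of the one in the pattern above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: compiles a pattern with the grammar named explicitly.
inline std::wregex make_ecmascript_regex(const std::wstring& pattern) {
    return std::wregex(pattern, std::regex_constants::ECMAScript);
}

// Hypothetical check: true when a single line does NOT begin a new log
// statement, i.e. it does not start with 'L' followed by a digit.
// The negative lookahead (?!L\d) requires the ECMAScript grammar.
inline bool is_continuation_line(const std::wstring& line) {
    static const std::wregex rgx = make_ecmascript_regex(L"^(?!L\\d)");
    return std::regex_search(line, rgx);
}
```

If the pattern compiles and `is_continuation_line` distinguishes header lines from other lines, lookahead support is in place and the grammar is unlikely to be the problem.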
