C++11 regex matching capturing group multiple times

挽巷 2020-12-21 11:54

Could someone please help me to extract the text between the : and the ^ symbols using a JavaScript (ECMAScript) regular expression in C++11. I do not need to capture the <

1 Answer
  • 2020-12-21 12:27

    With std::regex, you cannot keep multiple repeated captures when matching a string with consecutive repeated patterns: a capturing group inside a quantified group only retains the value from its last iteration.
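
To illustrate the limitation, here is a minimal sketch. The helper name `lastRepeatedCapture` and the single-regex pattern are assumptions made up for this illustration; they are not part of the question or of the solution further down.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns what capture group 1 holds after the
// repeated group has consumed every :x...^ chunk in `input`.
std::string lastRepeatedCapture(const std::string& input)
{
    // Group 1 sits inside a repeated non-capturing group, so each
    // iteration overwrites the value stored by the previous one.
    static const std::regex re("hw-descriptor(?::([pmu][^^]*)\\^)+", std::regex::icase);
    std::smatch m;
    if (!std::regex_search(input, m, re))
        return std::string();  // no match: return an empty string
    return m.str(1);
}
```

Calling `lastRepeatedCapture("hw-descriptor:pTEXT1^:mTEXT2^:uTEXT3^")` should yield only the last chunk, `uTEXT3`; the earlier two are not available from the match object.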

    What you can do is match the whole text containing the prefix and the repeated chunks, capture the chunks into a single group, and then use a second, smaller regex to extract each occurrence separately.

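    The first regex here may be the following (shown as a plain pattern; inside a C++ string literal the backslash must be doubled):

    hw-descriptor((?::[pmu][^^]*\^)+)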
    

    It matches hw-descriptor, and ((?::[pmu][^^]*\^)+) captures into Group 1 one or more repetitions of the :[pmu][^^]*\^ pattern: a colon, one of p/m/u, zero or more characters other than ^, and then a literal ^. Once a match is found, run the :[pmu][^^]*\^ regex over Group 1 to return each individual item.

    C++ demo:

    #include <iostream>
    #include <regex>
    #include <string>

    int main()
    {
        // Outer regex: the prefix plus all repeated chunks, captured as Group 1.
        static const std::regex gRegex("hw-descriptor((?::[pmu][^^]*\\^)+)", std::regex::icase);
        // Inner regex: a single :x...^ chunk.
        static const std::regex lRegex(":[pmu][^^]*\\^", std::regex::icase);
        std::string foo = "hw-descriptor:pTEXT1^:mTEXT2^:uTEXT3^ hw-descriptor:pTEXT8^:mTEXT8^:uTEXT83^";
        for(std::sregex_iterator i = std::sregex_iterator(foo.begin(), foo.end(), gRegex);
                                 i != std::sregex_iterator();
                                 ++i)
        {
            std::smatch m = *i;
            std::cout << "Match value: " << m.str() << std::endl;
            // Keep Group 1 in a named string so the inner iterators stay valid.
            std::string x = m.str(1);
            for(std::sregex_iterator j = std::sregex_iterator(x.begin(), x.end(), lRegex);
                                 j != std::sregex_iterator();
                                 ++j)
            {
                std::cout << "Element value: " << (*j).str() << std::endl;
            }
        }
        return 0;
    }
    

    Output:

    Match value: hw-descriptor:pTEXT1^:mTEXT2^:uTEXT3^
    Element value: :pTEXT1^
    Element value: :mTEXT2^
    Element value: :uTEXT3^
    Match value: hw-descriptor:pTEXT8^:mTEXT8^:uTEXT83^
    Element value: :pTEXT8^
    Element value: :mTEXT8^
    Element value: :uTEXT83^
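
The demo above prints each chunk with its : and ^ delimiters. If only the text between them is wanted, one possible variation is to add a capturing group to the inner regex. This is a sketch under assumptions: the helper name `extractFields` is made up for illustration, and it assumes the p/m/u letter should be kept as part of the extracted text.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text between ':' and '^' for every
// chunk that follows an hw-descriptor prefix in `input`.
std::vector<std::string> extractFields(const std::string& input)
{
    // Outer regex from the answer: Group 1 holds all repeated chunks.
    static const std::regex outer("hw-descriptor((?::[pmu][^^]*\\^)+)", std::regex::icase);
    // Inner regex with an added capturing group around the text
    // between the delimiters (assumption: the p/m/u letter is kept).
    static const std::regex inner(":([pmu][^^]*)\\^", std::regex::icase);

    std::vector<std::string> fields;
    const std::sregex_iterator end;
    for (std::sregex_iterator i(input.begin(), input.end(), outer); i != end; ++i)
    {
        // Named copy of Group 1 so the inner iterators stay valid.
        const std::string chunks = i->str(1);
        for (std::sregex_iterator j(chunks.begin(), chunks.end(), inner); j != end; ++j)
            fields.push_back(j->str(1));
    }
    return fields;
}
```

Moving the opening parenthesis one position to the right, `:[pmu]([^^]*)\\^`, would drop the type letter as well, if that is what the question intends.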
    