Regex C++: extract substring

陌清茗 2020-12-02 23:13

I would like to extract a substring between two others.
ex: /home/toto/FILE_mysymbol_EVENT.DAT
or just FILE_othersymbol_EVENT.DAT
And I

4 answers
  • 2020-12-02 23:50

    If you want to use regular expressions, I'd really recommend using C++11's regexes or, if you have a compiler that doesn't yet support them, Boost. Boost is something I consider almost-part-of-standard-C++.

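    But for this particular question, you do not really need regular expressions at all. Something like this sketch should work just fine; it returns an empty string when either delimiter is missing, and you should still add your own tests:

    #include <string>

    // Returns the text between `before` and `after`, or "" if either is missing.
    std::string between(std::string const &in,
                        std::string const &before, std::string const &after) {
      std::string::size_type beg = in.find(before);
      if (beg == std::string::npos) return std::string();
      beg += before.size();  // skip past the leading delimiter
      std::string::size_type end = in.find(after, beg);
      if (end == std::string::npos) return std::string();
      return in.substr(beg, end - beg);
    }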
    

    Obviously, you could turn std::string into a template parameter, and it should work just as well with std::wstring or less commonly used instantiations of std::basic_string.
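
A minimal sketch of that templated variant, for illustration only. The name between_generic and the choice to return an empty string when a delimiter is missing are assumptions made here, not part of the answer above.

```cpp
#include <string>

// Hypothetical generalisation of between() to any std::basic_string.
// Assumption: an empty string is returned when either delimiter is missing.
template <class CharT, class Traits, class Alloc>
std::basic_string<CharT, Traits, Alloc>
between_generic(std::basic_string<CharT, Traits, Alloc> const &in,
                std::basic_string<CharT, Traits, Alloc> const &before,
                std::basic_string<CharT, Traits, Alloc> const &after) {
  typedef std::basic_string<CharT, Traits, Alloc> String;
  typename String::size_type beg = in.find(before);
  if (beg == String::npos) return String();
  beg += before.size();  // skip past the leading delimiter
  typename String::size_type end = in.find(after, beg);
  if (end == String::npos) return String();
  return in.substr(beg, end - beg);
}
```

Because the character type is deduced from the arguments, string literals have to be wrapped explicitly, e.g. between_generic(std::wstring(L"FILE_x_EVENT.DAT"), std::wstring(L"FILE_"), std::wstring(L"_EVENT")).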

  • 2020-12-02 23:51

    TRegexp only supports a very limited subset of regular expressions compared to other regex flavors. This makes constructing a single regex that suits your needs somewhat awkward.

    One possible solution:

    [^_]*_([^_]*)_
    

    will match the string until the first underscore, then capture all characters until the next underscore. The relevant result of the match is then found in group number 1.

    But in your case, why use a regex at all? Just find the first and second occurrence of your delimiter _ in the string and extract the characters between those positions.
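
The delimiter approach could be sketched in plain C++ as follows. The helper name between_first_two and the empty-string result for fewer than two delimiters are assumptions for illustration; the sketch uses std::string rather than ROOT's string classes.

```cpp
#include <string>

// Hypothetical helper: returns the text between the first and second
// occurrence of `delim`, or an empty string if there are fewer than two.
std::string between_first_two(std::string const &in, char delim = '_') {
  std::string::size_type first = in.find(delim);
  if (first == std::string::npos) return std::string();
  std::string::size_type second = in.find(delim, first + 1);
  if (second == std::string::npos) return std::string();
  return in.substr(first + 1, second - first - 1);
}
```

Note that this counts underscores from the start of the whole string, so a directory name containing an underscore (for example /home/my_dir/FILE_x_EVENT.DAT) would give the wrong result. Stripping the directory part first avoids that.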

  • 2020-12-02 23:56

    Since C++11, regular expressions are built into the standard library (header <regex>). This program shows how to use them to extract the string you are after:

    #include <regex>
    #include <string>
    #include <iostream>
    
    int main()
    {
        const std::string s = "/home/toto/FILE_mysymbol_EVENT.DAT";
        std::regex rgx(".*FILE_(\\w+)_EVENT\\.DAT.*");
        std::smatch match;
    
        if (std::regex_search(s.begin(), s.end(), match, rgx))
            std::cout << "match: " << match[1] << '\n';
    }
    

    It will output:

    match: mysymbol
    

    It should be noted, though, that when this was first written it did not work in GCC, whose standard library (libstdc++) only gained a working <regex> implementation in GCC 4.9. It worked well in VS2010 (and probably VS2012), and should work in clang.


    By now (late 2016) all modern C++ compilers and their standard libraries are fully up to date with the C++11 standard, and most if not all of C++14 as well. GCC 6 and the upcoming Clang 4 support most of the coming C++17 standard as well.

  • 2020-12-03 00:02

    I would study corner cases before trusting it, but

       std::string text = "/home/toto/FILE_mysymbol_EVENT.DAT";
       std::regex re("(.*)(FILE_)(.*)(_EVENT\\.DAT)(.*)");
       std::cout << std::regex_replace(text, re, "$3") << '\n';
    

    is a good candidate.
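
One corner case worth knowing: when the pattern does not match, std::regex_replace returns the input unchanged, so a failed extraction is indistinguishable from a successful one. A minimal sketch that makes the failure explicit, assuming an empty string is an acceptable "no match" result (the helper name extract_symbol is hypothetical):

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the symbol between "FILE_" and "_EVENT.DAT",
// or an empty string when the input does not have that shape.
std::string extract_symbol(std::string const &text) {
  static const std::regex re("(.*)(FILE_)(.*)(_EVENT\\.DAT)(.*)");
  std::smatch m;
  // regex_match requires the whole string to match, like the replace above.
  if (!std::regex_match(text, m, re)) return std::string();
  return m[3].str();  // third capture group: the text between the markers
}
```

Here std::regex_match is used instead of std::regex_replace so that the caller can tell a non-matching file name apart from a matching one.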
