boost string matching DFA

Submitted by 拜拜、爱过 on 2019-11-30 22:59:15

Have you considered Spirit? You didn't specify how suffixes should be detected in context (must they appear at the end of the input, must some grammar precede them, and so on), but you could do something like this:

    x3::symbols<Char> sym;
    sym += "foo", "bar", "qux";

This builds a Trie, which is pretty efficient. It can parse from any forward iterator (including streams if you are so inclined, through an adapter such as boost::spirit::istream_iterator). Just add a constraint for contextual requirements, like end-of-input:

    bool has_suffix(string_view sv) {
        return parse(sv.cbegin(), sv.cend(), x3::seek[suffix >> x3::eoi]);
    }
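To make the mechanics a bit more concrete, here is a rough, Spirit-free sketch of what `seek[suffix >> eoi]` amounts to: try every start position, take the longest symbol the trie matches there, and accept only if that match ends exactly at the end of the input. `TrieNode`, `SymbolTrie` and `has_suffix_sketch` are hypothetical names invented for this illustration, and the longest-match behaviour is an assumption about how the symbol table resolves overlapping entries; this is not how Spirit is implemented internally.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string_view>

// Hypothetical stand-in for x3::symbols: a plain trie keyed on char32_t.
struct TrieNode {
    bool terminal = false;                               // a symbol ends at this node
    std::map<char32_t, std::unique_ptr<TrieNode>> next;  // one child per character
};

class SymbolTrie {
public:
    // Insert one symbol, creating nodes along its path as needed.
    void add(std::u32string_view word) {
        TrieNode* node = &root_;
        for (char32_t ch : word) {
            auto& child = node->next[ch];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->terminal = true;
    }

    // Length of the longest symbol that is a prefix of `text`, or 0 if none.
    std::size_t longest_match(std::u32string_view text) const {
        TrieNode const* node = &root_;
        std::size_t best = 0;
        for (std::size_t i = 0; i < text.size(); ++i) {
            auto it = node->next.find(text[i]);
            if (it == node->next.end()) break;
            node = it->second.get();
            if (node->terminal) best = i + 1;
        }
        return best;
    }

private:
    TrieNode root_;
};

// Rough equivalent of seek[suffix >> eoi]: scan forward over start positions
// and require that the matched symbol reaches the end of the input.
bool has_suffix_sketch(SymbolTrie const& trie, std::u32string_view text) {
    for (std::size_t start = 0; start < text.size(); ++start) {
        std::size_t len = trie.longest_match(text.substr(start));
        if (len != 0 && start + len == text.size()) return true;
    }
    return false;
}
```

After adding `U"foo"`, `U"bar"` and `U"qux"` to a `SymbolTrie`, calling `has_suffix_sketch(trie, U"lolbar you betqux")` should succeed while `U"nope"` should not. The Spirit version gets the same effect declaratively, which is the main reason to prefer it.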

If you also want the matched suffix text returned, simply do this:

    string_view get_suffix(string_view sv) {
        boost::iterator_range<string_view::const_iterator> output;
        parse(sv.cbegin(), sv.cend(), x3::seek[x3::raw[suffix >> x3::eoi]], output);
        return {output.begin(), output.size()};
    }

Spirit leaves you a lot of freedom to add smarts around this: dynamically add or remove symbols, use no_case with the Trie, and so on.
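As a rough picture of what "add/remove symbols" plus case-insensitive matching buys you, here is a small standard-library sketch. It is not Spirit code: `SuffixSet` and `to_lower_ascii` are hypothetical names made up for this example, the case folding is ASCII-only, and the lookup is a simple linear scan rather than a trie.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <string_view>

// Hypothetical helper: ASCII-only lower-casing (no locale or Unicode handling).
std::string to_lower_ascii(std::string_view s) {
    std::string out(s);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}

// Hypothetical mutable symbol table with case-insensitive suffix matching.
class SuffixSet {
public:
    // Symbols are stored lower-cased so lookups can fold the input once.
    void add(std::string_view s) { symbols_.insert(to_lower_ascii(s)); }
    void remove(std::string_view s) { symbols_.erase(to_lower_ascii(s)); }

    // True if `text` ends with any stored symbol, ignoring ASCII case.
    bool matches_end_of(std::string_view text) const {
        std::string const folded = to_lower_ascii(text);
        return std::any_of(symbols_.begin(), symbols_.end(),
                           [&](std::string const& sym) {
                               return !sym.empty() && folded.size() >= sym.size() &&
                                      folded.compare(folded.size() - sym.size(),
                                                     sym.size(), sym) == 0;
                           });
    }

private:
    std::set<std::string> symbols_;
};
```

With Spirit, the same flexibility comes from modifying the symbol table at runtime and wrapping the parser in `x3::no_case[...]`, as hinted at in the comment inside the full demo.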

Full Demo

Using X3 (C++14; the demo itself needs C++17 for std::string_view)

Live On Coliru

    #include <boost/spirit/home/x3.hpp>
    #include <string_view>
    #include <string>
    #include <cstdint>

    namespace Demo {
        using Char = char32_t;
        using string_view = std::basic_string_view<Char>;

        namespace x3 = boost::spirit::x3;

        // Symbol table holding the suffixes to look for; built once at startup.
        static auto const suffix = [] {
            x3::symbols<Char> sym;
            sym += "foo", "bar", "qux";

            return sym; // x3::no_case[sym];
        }();

        // True if the input ends with one of the suffixes:
        // seek skips ahead until `suffix >> eoi` matches.
        bool has_suffix(string_view sv) {
            return parse(sv.cbegin(), sv.cend(), x3::seek[suffix >> x3::eoi]);
        }

        // Returns the matched suffix as a view into the input (empty if none).
        string_view get_suffix(string_view sv) {
            boost::iterator_range<string_view::const_iterator> output;
            parse(sv.cbegin(), sv.cend(), x3::seek[x3::raw[suffix >> x3::eoi]], output);
            // Assumes string_view::const_iterator is a plain pointer (as in libstdc++).
            return {output.begin(), output.size()};
        }
    }

    #include <iostream>
    #include <iomanip>

    int main() {
        using namespace Demo;

        // Convert the UTF-32 view to a wide string so it can be printed via wcout.
        auto widen = [](string_view sv) { return std::wstring(sv.begin(), sv.end()); };
        std::wcout << std::boolalpha;

        for (string_view testcase : { U"nope", U"lolbar you betqux" }) {
            std::wcout
                << widen(testcase)
                << L" -> " << has_suffix(testcase)
                << L" (" << widen(get_suffix(testcase))
                << L")\n";
        }
    }

Prints

nope -> false ()
lolbar you betqux -> true (qux)

Spirit Qi Version

A literal port: Live On Coliru

A C++11 only version: Live On Coliru

And a C++03 version for the really retro programming experience: Live On Coliru
