Parsing nested key value pairs in Boost Spirit

一世执手 submitted on 2019-12-03 15:22:02

Indeed, this is precisely the kind of simple grammar that Spirit excels at.

Moreover, there is absolutely no need to strip whitespace up front: Spirit has skippers built in for that purpose.

To your explicit question, though:

The Sequence rule is overcomplicated. You could just use the list operator (%):

Sequence = Pair % char_(";&");

Now your problem is that you end the sequence with a ; that isn't expected, so both Sequence and Value fail the parse eventually. This isn't very clear unless you #define BOOST_SPIRIT_DEBUG¹ and inspect the debug output.

So to fix it use:

Sequence = Pair % char_(";&") >> -omit[char_(";&")];

Fix Live On Coliru (or with debug info)

Prints:

Key1|Value1
Key2|01-Jan-2015
Key3|2.7181
Key4|Johnny
=====
Key1|Value1
Key2|InnerK1=one;IK2=11-Nov-2011;
=====
K1|V1
K2|IK1=IV1;IK2=IV2;
K3|V3
K4|JK1=JV1;JK2=JV2;

Bonus Cleanup

Actually, the cleanup is simple: just delete the redundant line that strips whitespace. The skipper was already qi::space.

(Note, though, that the skipper doesn't apply to your Value rule, so values cannot contain whitespace, but the parser will not silently skip it either. I suppose this is likely what you want; just be aware of it.)

Recursive AST

You would actually want to have a recursive AST, instead of parsing into a flat map.

Boost recursive variants make this a breeze:

namespace ast {
    typedef boost::make_recursive_variant<std::string, std::map<std::string, boost::recursive_variant_> >::type Value;
    typedef std::map<std::string, Value> Sequence;
}

To make this work you just change the declared attribute types of the rules:

qi::rule<It, ast::Sequence(),                      Skipper> Sequence;
qi::rule<It, std::pair<std::string, ast::Value>(), Skipper> Pair;
qi::rule<It, std::string(),                        Skipper> String;
qi::rule<It, std::string()>                                 KeyName;

The rules themselves don't even have to change at all. You will need to write a little visitor to stream the AST:

static inline std::ostream& operator<<(std::ostream& os, ast::Value const& value) {
    struct vis : boost::static_visitor<> {
        vis(std::ostream& os, std::string indent = "") : _os(os), _indent(indent) {}

        void operator()(std::map<std::string, ast::Value> const& map) const {
            _os << "map {\n";
            for (auto& entry : map) {
                _os << _indent << "    " << entry.first << '|';
                boost::apply_visitor(vis(_os, _indent+"    "), entry.second);
                _os << "\n";
            }
            _os << _indent << "}\n";
        }
        void operator()(std::string const& s) const {
            _os << s;
        }

    private:
        std::ostream& _os;
        std::string _indent;
    };
    boost::apply_visitor(vis(os), value);
    return os;
}

Now it prints:

map {
    Key1|Value1
    Key2|01-Jan-2015
    Key3|2.7181
    Key4|Johnny
}

=====
map {
    Key1|Value1
    Key2|InnerK1 = one; IK2 = 11-Nov-2011;
}

=====
map {
    K1|V1
    K2|IK1=IV1; IK2=IV2;
    K3|V3
    K4|JK1=JV1; JK2=JV2;
}

Of course, the clincher is when you change raw[Sequence] to just Sequence in the Pair rule, so that nested blocks are parsed recursively instead of being captured as raw text:

map {
    Key1|Value1
    Key2|01-Jan-2015
    Key3|2.7181
    Key4|Johnny
}

=====
map {
    Key1|Value1
    Key2|map {
        IK2|11-Nov-2011
        InnerK1|one
    }

}

=====
map {
    K1|V1
    K2|map {
        IK1|IV1
        IK2|IV2
    }

    K3|V3
    K4|map {
        JK1|JV1
        JK2|JV2
    }

}

Full Demo Code

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
#include <boost/variant.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <iostream>
#include <string>
#include <map>

namespace ast {
    typedef boost::make_recursive_variant<std::string, std::map<std::string, boost::recursive_variant_> >::type Value;
    typedef std::map<std::string, Value> Sequence;
}

namespace qi = boost::spirit::qi;

template <typename It, typename Skipper>
struct NestedGrammar : qi::grammar <It, ast::Sequence(), Skipper>
{
    NestedGrammar() : NestedGrammar::base_type(Sequence)
    {
        using namespace qi;
        KeyName = qi::char_("a-zA-Z_") >> *qi::char_("a-zA-Z0-9_");
        String = +qi::char_("-.a-zA-Z_0-9");

        Pair = KeyName >> -(
                '=' >> ('{' >> Sequence >> '}' | String)
            );

        Sequence = Pair % char_(";&") >> -omit[char_(";&")];

        BOOST_SPIRIT_DEBUG_NODES((KeyName) (String) (Pair) (Sequence))
    }
private:
    qi::rule<It, ast::Sequence(),                      Skipper> Sequence;
    qi::rule<It, std::pair<std::string, ast::Value>(), Skipper> Pair;
    qi::rule<It, std::string(),                        Skipper> String;
    qi::rule<It, std::string()>                                 KeyName;
};


template <typename Iterator>
ast::Sequence DoParse(Iterator begin, Iterator end)
{
    NestedGrammar<Iterator, qi::space_type> p;
    ast::Sequence data;
    qi::phrase_parse(begin, end, p, qi::space, data);
    return data;
}

static inline std::ostream& operator<<(std::ostream& os, ast::Value const& value) {
    struct vis : boost::static_visitor<> {
        vis(std::ostream& os, std::string indent = "") : _os(os), _indent(indent) {}

        void operator()(std::map<std::string, ast::Value> const& map) const {
            _os << "map {\n";
            for (auto& entry : map) {
                _os << _indent << "    " << entry.first << '|';
                boost::apply_visitor(vis(_os, _indent+"    "), entry.second);
                _os << "\n";
            }
            _os << _indent << "}\n";
        }
        void operator()(std::string const& s) const {
            _os << s;
        }

    private:
        std::ostream& _os;
        std::string _indent;
    };
    boost::apply_visitor(vis(os), value);
    return os;
}

int main()
{
    std::string const Example1 = "Key1=Value1 ; Key2 = 01-Jan-2015; Key3 = 2.7181; Key4 = Johnny";
    std::string const Example2 = "Key1 = Value1; Key2 = {InnerK1 = one; IK2 = 11-Nov-2011;};";
    std::string const Example3 = "K1 = V1; K2 = {IK1=IV1; IK2=IV2;}; K3=V3; K4 = {JK1=JV1; JK2=JV2;};";

    std::cout << DoParse(Example1.begin(), Example1.end()) << "\n";
    std::cout << DoParse(Example2.begin(), Example2.end()) << "\n";
    std::cout << DoParse(Example3.begin(), Example3.end()) << "\n";
}

¹ You "had" it, but not in the right place! It should go before any Boost includes.
