Understanding the List Operator (%) in Boost.Spirit

Posted by 我怕爱的太早我们不能终老 on 2019-11-26 14:38:12

Question


Can you help me understand the difference between the a % b parser and its expanded a >> *(b >> a) form in Boost.Spirit? Even though the reference manual states that they are equivalent,

The list operator, a % b, is a binary operator that matches a list of one or more repetitions of a separated by occurrences of b. This is equivalent to a >> *(b >> a).

the following program produces different results depending on which is used:

#include <iostream>
#include <string>
#include <vector>

#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/qi.hpp>

struct Record {
  int id;
  std::vector<int> values;
};

BOOST_FUSION_ADAPT_STRUCT(Record,
  (int, id)
  (std::vector<int>, values)
)

int main() {
  namespace qi = boost::spirit::qi;

  const auto str = std::string{"1: 2, 3, 4"};

  const auto rule1 = qi::int_ >> ':' >> (qi::int_ % ',')                 >> qi::eoi;
  const auto rule2 = qi::int_ >> ':' >> (qi::int_ >> *(',' >> qi::int_)) >> qi::eoi;

  Record record1;
  if (qi::phrase_parse(str.begin(), str.end(), rule1, qi::space, record1)) {
    std::cout << record1.id << ": ";
    for (const auto& value : record1.values) { std::cout << value << ", "; }
    std::cout << '\n';
  } else {
    std::cerr << "syntax error\n";
  }

  Record record2;
  if (qi::phrase_parse(str.begin(), str.end(), rule2, qi::space, record2)) {
    std::cout << record2.id << ": ";
    for (const auto& value : record2.values) { std::cout << value << ", "; }
    std::cout << '\n';
  } else {
    std::cerr << "syntax error\n";
  }
}

Live on Coliru

1: 2, 3, 4, 
1: 2, 

rule1 and rule2 are different only in that rule1 uses the list operator ((qi::int_ % ',')) and rule2 uses its expanded form ((qi::int_ >> *(',' >> qi::int_))). However, rule1 produced 1: 2, 3, 4, (as expected) and rule2 produced 1: 2,. I cannot understand the result of rule2: 1) why is it different from that of rule1 and 2) why were 3 and 4 not included in record2.values even though phrase_parse returned true somehow?


Answer 1:


Update: X3 version added

First off, you have fallen into a deep trap here:

Qi parser expressions don't work with auto: the expression templates hold references to temporaries that are gone by the end of the statement. Use qi::copy or just use qi::rule<>. Your program has undefined behaviour and indeed it crashed for me (valgrind pointed out where the dangling references originated).

So, before anything else:

const auto rule = qi::copy(qi::int_ >> ':' >> (qi::int_ % ',')                 >> qi::eoi); 

Now, when you delete the redundancy in the program, you get:

Reproducing the problem

Live On Coliru

int main() {
    test(qi::copy(qi::int_ >> ':' >> (qi::int_ % ',')));
    test(qi::copy(qi::int_ >> ':' >> (qi::int_ >> *(',' >> qi::int_))));
}

Printing

1: 2, 3, 4, 
1: 2, 

The cause and the fix

What happened to 3 and 4, which were successfully parsed?

Well, the attribute propagation rules indicate that qi::int_ >> *(',' >> qi::int_) exposes a tuple<int, vector<int> >. In a bid to magically DoTheRightThing(TM), Spirit misfires and "assigns" the int into the attribute reference, ignoring the remaining vector<int>.

If you want to make container attributes parse as "an atomic group", use qi::as<>:

test(qi::copy(qi::int_ >> ':' >> qi::as<Record::values_t>() [ qi::int_ >> *(',' >> qi::int_)]));

Here as<> acts as a barrier for the attribute compatibility heuristics and the grammar knows what you meant:

Live On Coliru

#include <iostream>
#include <string>
#include <vector>

#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/qi.hpp>

struct Record {
  int id;
  using values_t = std::vector<int>;
  values_t values;
};

BOOST_FUSION_ADAPT_STRUCT(Record, id, values)

namespace qi = boost::spirit::qi;

template <typename T>
void test(T const& rule) {
    const std::string str = "1: 2, 3, 4";

    Record record;

    if (qi::phrase_parse(str.begin(), str.end(), rule >> qi::eoi, qi::space, record)) {
        std::cout << record.id << ": ";
        for (const auto& value : record.values) { std::cout << value << ", "; }
        std::cout << '\n';
    } else {
        std::cerr << "syntax error\n";
    }
}

int main() {
    test(qi::copy(qi::int_ >> ':' >> (qi::int_ % ',')));
    test(qi::copy(qi::int_ >> ':' >> (qi::int_ >> *(',' >> qi::int_))));
    test(qi::copy(qi::int_ >> ':' >> qi::as<Record::values_t>() [ qi::int_ >> *(',' >> qi::int_)]));
}

Prints

1: 2, 3, 4, 
1: 2, 
1: 2, 3, 4, 



Answer 2:


Because it's time to get people started with X3 (the new version of Spirit), and because I like to challenge myself to do the corresponding tasks in Spirit X3, here is the Spirit X3 version.

There's no problem with auto in X3.

The "broken" case also behaves much better, triggering this static assertion:

    // If you got an error here, then you are trying to pass
    // a fusion sequence with the wrong number of elements
    // as that expected by the (sequence) parser.
    static_assert(
        fusion::result_of::size<Attribute>::value == (l_size + r_size)
      , "Attribute does not have the expected size."
    );

That's nice, right?

The workaround seems a bit less readable:

test(int_ >> ':' >> (rule<struct _, Record::values_t>{} = (int_ >> *(',' >> int_))));

But it would be trivial to write your own as<> "directive" (or just a function), if you wanted:

namespace {
    template <typename T>
    struct as_type {
        template <typename Expr>
            auto operator[](Expr&& expr) const {
                return x3::rule<struct _, T>{"as"} = x3::as_parser(std::forward<Expr>(expr));
            }
    };

    template <typename T> static const as_type<T> as = {};
}

DEMO

Live On Coliru

#include <iostream>
#include <string>
#include <vector>

#include <boost/fusion/adapted/std_tuple.hpp>
#include <boost/spirit/home/x3.hpp>

struct Record {
    int id;
    using values_t = std::vector<int>;
    values_t values;
};

namespace x3 = boost::spirit::x3;

template <typename T>
void test(T const& rule) {
    const std::string str = "1: 2, 3, 4";

    Record record;

    auto attr = std::tie(record.id, record.values);

    if (x3::phrase_parse(str.begin(), str.end(), rule >> x3::eoi, x3::space, attr)) {
        std::cout << record.id << ": ";
        for (const auto& value : record.values) { std::cout << value << ", "; }
        std::cout << '\n';
    } else {
        std::cerr << "syntax error\n";
    }
}

namespace {
    template <typename T>
    struct as_type {
        template <typename Expr>
            auto operator[](Expr&& expr) const {
                return x3::rule<struct _, T>{"as"} = x3::as_parser(std::forward<Expr>(expr));
            }
    };

    template <typename T> static const as_type<T> as = {};
}

int main() {
    using namespace x3;
    test(int_ >> ':' >> (int_ % ','));
    //test(int_ >> ':' >> (int_ >> *(',' >> int_))); // COMPILER asserts "Attribute does not have the expected size."

    // "clumsy" x3 style workaround
    test(int_ >> ':' >> (rule<struct _, Record::values_t>{} = (int_ >> *(',' >> int_))));

    // using an ad-hoc `as<>` implementation:
    test(int_ >> ':' >> as<Record::values_t>[int_ >> *(',' >> int_)]);
}

Prints

1: 2, 3, 4, 
1: 2, 3, 4, 
1: 2, 3, 4, 


Source: https://stackoverflow.com/questions/33816662/understanding-the-list-operator-in-boost-spirit
