Question
I need to parse a line containing an unsigned int, the character X that is to be discarded, and a string, all separated by one or more spaces. e.g., 1234 X abcd
bool a = qi::phrase_parse(first, last,
uint_[ref(num) = _1] >> lit('X') >> lexeme[+(char_ - ' ')],
space, parsed_str);
The above code parses the three parts, but the string ends up containing a junk character (�abcd) and having a size of 5 and not 4.
What is wrong with my parser, and why is there junk in the string?
Answer 1:
What you probably haven't realized is that parser expressions stop having automatic attribute propagation in the presence of semantic actions*.
* Documentation background: How Do Rules Propagate Their Attributes?
You're using a semantic action to 'manually' propagate the attribute of the uint_ parser:
[ref(num) = _1] // this is a Semantic Action
So the easiest way to fix this would be to propagate num automatically too (the way the qi::parse and qi::phrase_parse APIs were intended to be used):
bool ok = qi::phrase_parse(first, last, // input iterators
uint_ >> lit('X') >> lexeme[+(char_ - ' ')], // parser expr
space, // skipper
num, parsed_str); // output attributes
Or, addressing some off-topic points, even cleaner:
bool ok = qi::phrase_parse(first, last,
uint_ >> 'X' >> lexeme[+graph],
blank,
num, parsed_str);
As you can see, you can pass multiple lvalues as output attribute recipients (see notes 1 and 2 below).
See it in a live demo on Coliru (link)
There's a whole lot of magic going on, which in practice leads to my rule of thumb:
Avoid using semantic actions in Spirit Qi expressions unless you absolutely have to
I have written about this before, in an answer specifically about this: Boost Spirit: "Semantic actions are evil"?
In my experience, it's almost always cleaner to use the Attribute Customization Points to tweak the automatic propagation than to abandon auto rules and resort to manual attribute handling.
1 What technically happens to propagate these attributes, is that num and parsed_str will be 'tied' to the whole parse expression as a Fusion sequence:
fusion::vector2<unsigned&, std::string&>
and the exposed attribute of the rule:
fusion::vector2<unsigned, std::vector<char> >
will be 'transformed' to that during assignment. The attribute compatibility rules allow this conversion, and many others.
2 Alternatively, use semantic actions for both:
bool ok = qi::phrase_parse(first, last,
(uint_ >> 'X' >> as_string [ lexeme[+graph] ])
[ phx::ref(num) = _1, phx::ref(parsed_str) = _2 ],
blank);
There's a few subtleties here:
- we need as_string here to expose the attribute as std::string instead of std::vector<char> (see above)
- we need to qualify phx::ref(parsed_str), since even using boost::phoenix::ref will not be enough to disambiguate std::ref and phx::ref: ADL will drag in std::ref since it is from the same namespace as the type of parsed_str.
- group the semantic action to prevent partially assigned results, e.g. the following would overwrite num even though X may be missing in the input:
bool ok = qi::phrase_parse(first, last,
uint_ [ phx::ref(num) = _1 ] >> 'X' >> as_string [ lexeme[+graph] ] [ phx::ref(parsed_str) = _1 ],
blank);
All of this complexity can be hidden from your view if you avoid manual attribute propagation!
Source: https://stackoverflow.com/questions/17499442/boost-spirit-parsing-number-char-and-string