I'm trying to write a simple parser for an even simpler language that I'm writing. It's composed of postfix expressions. As of now, I'm having issues with the parser. When I
The problem is that your body rule is allowed to match nothing, so a repetition over it never terminates: once body stops consuming input, it keeps succeeding on an empty match at the same position. I didn't fire up ANTLR (I really don't like to mess with it). Instead, I rewrote your grammar in C++ using the AXE parser generator, added print statements to trace the matches, and got the following result from parsing "2 2 * test >>":
parsed term: 2
parsed expr: 2
parsed nested: 2
parsed term: 2
parsed expr: 2
parsed nested: 2
parsed body: 2 2
parsed body:
parsed body: ... here goes your infinite loop
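To see why the trace stalls on empty "parsed body:" lines, here is a minimal sketch of the mechanism in plain C++17, without AXE. The names Parser, number_item, star, plus_guarded and PlusResult are hypothetical helpers invented for this illustration and are not part of AXE or ANTLR. A zero-or-more rule always succeeds, so a one-or-more loop around it only ends if it checks that the position actually advanced:

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Hypothetical parser model: returns the end position of the match,
// or std::nullopt on failure. This is not AXE's API.
using Parser =
    std::function<std::optional<std::size_t>(const std::string&, std::size_t)>;

// Matches one or more digits followed by optional spaces.
Parser number_item() {
    return [](const std::string& s, std::size_t pos) -> std::optional<std::size_t> {
        std::size_t i = pos;
        while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
        if (i == pos) return std::nullopt;  // no digits: fail
        while (i < s.size() && s[i] == ' ') ++i;
        return i;
    };
}

// Zero-or-more repetition: always succeeds, possibly with an empty match.
Parser star(Parser p) {
    return [p](const std::string& s, std::size_t pos) -> std::optional<std::size_t> {
        while (true) {
            auto next = p(s, pos);
            if (!next || *next == pos) break;  // stop on failure or no progress
            pos = *next;
        }
        return pos;
    };
}

struct PlusResult {
    std::size_t end;  // position after the last match that consumed input
    int iterations;   // how many times the operand succeeded
    bool matched;     // true if the operand succeeded at least once
};

// One-or-more repetition with a progress guard.
PlusResult plus_guarded(const Parser& p, const std::string& s, std::size_t pos) {
    PlusResult r{pos, 0, false};
    while (true) {
        auto next = p(s, r.end);
        if (!next) break;
        r.matched = true;
        ++r.iterations;
        // Empty match: without this break the loop would spin forever,
        // which corresponds to the repeated empty "parsed body:" lines.
        if (*next == r.end) break;
        r.end = *next;
    }
    return r;
}
```

With body modelled as star(number_item()), calling plus_guarded(body, "2 2 * test >>", 0) matches "2 2 " on the first pass and an empty string on the second, then stops because of the guard. Remove the guard and the second pass repeats indefinitely.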
If you are interested in debugging this test case, the AXE grammar is shown below. Set breakpoints at the print statements to step through the parser:
using namespace axe;
typedef std::string::iterator It;

auto space = r_any(" \t\n\r");
auto int_rule = r_numstr();
auto id = r_ident();
auto op = r_any("*+/%-");

auto term = int_rule
    >> e_ref([](It i1, It i2)
    {
        std::cout << "\nparsed term: " << std::string(i1, i2);
    });

auto expr = (term & *(term & op))
    >> e_ref([](It i1, It i2)
    {
        std::cout << "\nparsed expr: " << std::string(i1, i2);
    });

auto nested = (expr & *(expr & op))
    >> e_ref([](It i1, It i2)
    {
        std::cout << "\nparsed nested: " << std::string(i1, i2);
    });

auto get = (id & "<<")
    >> e_ref([](It i1, It i2)
    {
        std::cout << "\nparsed get: " << std::string(i1, i2);
    });

auto var = (nested & id & ">>")
    >> e_ref([](It i1, It i2)
    {
        std::cout << "\nparsed var: " << std::string(i1, i2);
    });

auto body = (*(nested & space) | *(var & space) | *(get & space))
    >> e_ref([](It i1, It i2)
    {
        std::cout << "\nparsed body: " << std::string(i1, i2);
    });

auto program = +(body)
    | r_fail([](It i1, It i2)
    {
        std::cout << "\nparsing failed, parsed portion: "
            << std::string(i1, i2);
    });

// test parser
std::ostringstream text;
text << "2 2 * test >>";
std::string str = text.str();
program(str.begin(), str.end());
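A grammar shaped like the one above can also be checked mechanically for this kind of loop. The sketch below is plain C++ and does not use AXE. The Rule type and the helper functions are hypothetical, made up for this illustration. It computes whether a rule can match the empty string and flags any repetition whose operand can. The restructured grammar in restructured_program() is only an assumption about what you intended (one statement at a time, each followed by optional whitespace), not a tested replacement:

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical mini-AST for grammar rules; unrelated to AXE's own types.
struct Rule {
    enum Kind { Token, Seq, Alt, Star, Plus } kind;
    std::vector<std::shared_ptr<Rule>> kids;
};
using RulePtr = std::shared_ptr<Rule>;

RulePtr make(Rule::Kind k, std::vector<RulePtr> kids = {}) {
    return std::make_shared<Rule>(Rule{k, std::move(kids)});
}
RulePtr token() { return make(Rule::Token); }
RulePtr seq(std::vector<RulePtr> k) { return make(Rule::Seq, std::move(k)); }
RulePtr alt(std::vector<RulePtr> k) { return make(Rule::Alt, std::move(k)); }
RulePtr star(RulePtr r) { return make(Rule::Star, {std::move(r)}); }
RulePtr plus(RulePtr r) { return make(Rule::Plus, {std::move(r)}); }

// True if the rule can succeed without consuming any input.
bool nullable(const Rule& r) {
    switch (r.kind) {
        case Rule::Token: return false;
        case Rule::Star:  return true;
        case Rule::Plus:  return nullable(*r.kids[0]);
        case Rule::Seq:
            for (const auto& k : r.kids) if (!nullable(*k)) return false;
            return true;
        case Rule::Alt:
            for (const auto& k : r.kids) if (nullable(*k)) return true;
            return false;
    }
    return false;
}

// True if some repetition has a nullable operand, i.e. it can keep
// matching the empty string at the same position.
bool loops_forever(const Rule& r) {
    if ((r.kind == Rule::Star || r.kind == Rule::Plus) && nullable(*r.kids[0]))
        return true;
    for (const auto& k : r.kids) if (loops_forever(*k)) return true;
    return false;
}

// Shared pieces, mirroring the shape of the rules in the grammar above.
RulePtr nested_rule() {
    auto expr = seq({token(), star(seq({token(), token()}))});  // term *(term op)
    return seq({expr, star(seq({expr, token()}))});             // expr *(expr op)
}
RulePtr var_rule() { return seq({nested_rule(), token(), token()}); }  // nested id ">>"
RulePtr get_rule() { return seq({token(), token()}); }                 // id "<<"

// Same shape as the original: body is a choice of zero-or-more loops.
RulePtr original_program() {
    auto space = token();
    auto body = alt({star(seq({nested_rule(), space})),
                     star(seq({var_rule(), space})),
                     star(seq({get_rule(), space}))});
    return plus(body);
}

// Assumed restructuring: repeat one non-empty statement at a time.
RulePtr restructured_program() {
    auto statement = alt({var_rule(), get_rule(), nested_rule()});
    return plus(seq({statement, star(token())}));  // statement *space
}
```

Here loops_forever(*original_program()) reports the problem because body is nullable, while the restructured version passes because every iteration of the outer repetition has to consume at least one statement.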