A piece of code of my gramamar its driveing me crazy.
I have to write a grammar that allow write functions with multiple inputs
e.g.
function
Bison provides you with at least a hint. In State 9, which is really the only relevant part of the output other than the grammar itself, we see:
State 9
3 InstanciaFuncion: PuntoEntrada Sentencias .
5 Sentencias: Sentencias . Sentencia
ID shift, and go to state 8
ID [reduce using rule 3 (InstanciaFuncion)]
$default reduce using rule 3 (InstanciaFuncion)
Sentencia go to state 12
There's a shift/reduce conflict with ID
, in the context in which the possibilities are:
Complete the parse of an InstanciaFuncion
(reduce)
Continue the parse of a Sentencias
(shift)
In both of those contexts, an ID
is possible. It's easy to construct an example. Consider these two instancias:
f : a = b c = d ...
f : a = b c : d = ...
We've finished with the b
and c
is the lookahead, so we can't see the symbol which follows the c. Now, have we finished parsing the funcion f
? Or should we try for a longer list of sentencias? No se sabe. (Nobody knows.)
Yes, your grammar is unambiguous, so it doesn't need to be disambiguated. It's not LR(1), though: you cannot tell what to do by only looking at the next one symbol. However, it is LR(2), and there is a proof than any LR(2) grammar has a corresponding LR(1) grammar. (For any value of 2 :) ). But, unfortunately, actually doing the transformation is not always very pretty. It can be done mechanically, but the resulting grammar can be hard to read. (See Notes below for references.)
In your case, it's pretty easy to find an equivalent grammar, but the parse tree will need to be adjusted. Here's one example:
InstanciasFuncion : PuntoEntrada
| InstanciasFuncion PuntoEntrada
| InstanciasFuncion Sentencia
PuntoEntrada: ID ':' Sentencia
Sentencia : ID '=' ID
It's a curious fact that this precise shift/reduce conflict is a feature of the grammar of bison itself, since bison accepts grammars as written above (i.e. without semi-colons). Posix insists that yacc
do so, and bison tries to emulate yacc
. Bison itself solves this problem in the scanner, not in the grammar: it's scanner recognizes "ID :" as a single token (even if separated with arbitrary whitespace). That might also be your best bet.
There is an excellent description of the proof than any LR(k) grammar can be covered by an LR(1) grammar, including the construction technique and a brief description of how to recover the original parse tree, in Sippu & Soisalon-Soininen, Parsing Theory, Vol. II (Springer Verlag, 1990) (Amazon). This two-volume set is a great reference for theoreticians, and has a lot of valuable practical information, but its heavy reading and its also a serious investment. If you have a university library handy, there should be a copy of it available. The algorithm presented is due to MD Mickunas, and was published in 1976 in JACM 23:17-30 (paywalled), which you should also be able to find in a good university library. Failing that, I found a very abbreviated description in Richard Marion Schell's thesis.
Personally, I wouldn't bother with all that, though. Either use a GLR parser, or use the same trick bison uses for the same purpose. Or use the simple grammar in the answer above and fiddle with the AST afterwards; it's not really difficult.