A yacc shift/reduce conflict on an unambiguous grammar

大憨熊 提交于 2019-11-28 13:02:37

Bison provides you with at least a hint. In State 9, which is really the only relevant part of the output other than the grammar itself, we see:

State 9

    3 InstanciaFuncion: PuntoEntrada Sentencias .
    5 Sentencias: Sentencias . Sentencia

    ID  shift, and go to state 8

    ID        [reduce using rule 3 (InstanciaFuncion)]
    $default  reduce using rule 3 (InstanciaFuncion)

    Sentencia  go to state 12

There's a shift/reduce conflict with ID, in the context in which the possibilities are:

  1. Complete the parse of an InstanciaFuncion (reduce)

  2. Continue the parse of a Sentencias (shift)

In both of those contexts, an ID is possible. It's easy to construct an example. Consider these two instancias:

f : a = b c = d ...
f : a = b c : d = ...

We've finished with the b and c is the lookahead, so we can't see the symbol which follows the c. Now, have we finished parsing the funcion f? Or should we try for a longer list of sentencias? No se sabe. (Nobody knows.)

Yes, your grammar is unambiguous, so it doesn't need to be disambiguated. It's not LR(1), though: you cannot tell what to do by only looking at the next one symbol. However, it is LR(2), and there is a proof than any LR(2) grammar has a corresponding LR(1) grammar. (For any value of 2 :) ). But, unfortunately, actually doing the transformation is not always very pretty. It can be done mechanically, but the resulting grammar can be hard to read. (See Notes below for references.)

In your case, it's pretty easy to find an equivalent grammar, but the parse tree will need to be adjusted. Here's one example:

InstanciasFuncion : PuntoEntrada 
                  | InstanciasFuncion PuntoEntrada
                  | InstanciasFuncion Sentencia

PuntoEntrada: ID ':' Sentencia

Sentencia : ID '=' ID

It's a curious fact that this precise shift/reduce conflict is a feature of the grammar of bison itself, since bison accepts grammars as written above (i.e. without semi-colons). Posix insists that yacc do so, and bison tries to emulate yacc. Bison itself solves this problem in the scanner, not in the grammar: it's scanner recognizes "ID :" as a single token (even if separated with arbitrary whitespace). That might also be your best bet.


There is an excellent description of the proof than any LR(k) grammar can be covered by an LR(1) grammar, including the construction technique and a brief description of how to recover the original parse tree, in Sippu & Soisalon-Soininen, Parsing Theory, Vol. II (Springer Verlag, 1990) (Amazon). This two-volume set is a great reference for theoreticians, and has a lot of valuable practical information, but its heavy reading and its also a serious investment. If you have a university library handy, there should be a copy of it available. The algorithm presented is due to MD Mickunas, and was published in 1976 in JACM 23:17-30 (paywalled), which you should also be able to find in a good university library. Failing that, I found a very abbreviated description in Richard Marion Schell's thesis.

Personally, I wouldn't bother with all that, though. Either use a GLR parser, or use the same trick bison uses for the same purpose. Or use the simple grammar in the answer above and fiddle with the AST afterwards; it's not really difficult.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!