How to remove ambiguity in the following grammar?

Question

How to remove ambiguity in the following grammar?

E -> E * F | F + E | F

F -> F - F | id

Answer 1:

First, we need to find the ambiguity.

Consider the rules for E in isolation: replace F with f and treat f as a terminal symbol. Then the grammar

E -> E * f
E -> f + E
E -> f

is ambiguous. Consider f + f * f:

    E                      E
    |                      |
    +-------+--+           +-+-+
    |       |  |           | | |
    E       *  f           f + E
  +-+-+                        |
  | | |                        +-+-+
  f + E                        E * f
      |                        |
      f                        f
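
The two trees above can also be found mechanically. Here is a minimal Python sketch, not part of the original answer, that counts how many parse trees a grammar assigns to a list of tokens. The names GRAMMAR and count_parse_trees are made up for this illustration, and the approach assumes the grammar has no empty productions and no cycles of unit productions.

```python
from functools import lru_cache

# The simplified grammar, with f treated as a terminal symbol.
GRAMMAR = {"E": (("E", "*", "f"), ("f", "+", "E"), ("f",))}


def count_parse_trees(grammar, start, tokens):
    """Return the number of distinct parse trees deriving tokens from start."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(symbol, i, j):
        # Number of ways `symbol` derives tokens[i:j].
        if symbol not in grammar:  # terminal: must match exactly one token
            return int(j - i == 1 and tokens[i] == symbol)
        return sum(seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives tokens[i:j].
        if len(rhs) == 1:
            return trees(rhs[0], i, j)
        # Every symbol covers at least one token, so the first symbol ends
        # at some k that leaves one token for each remaining symbol.
        return sum(
            trees(rhs[0], i, k) * seq(rhs[1:], k, j)
            for k in range(i + 1, j - len(rhs) + 2)
        )

    return trees(start, 0, len(tokens))


print(count_parse_trees(GRAMMAR, "E", "f + f * f".split()))  # 2
```

A count greater than 1 for any string means the grammar is ambiguous; here the count of 2 matches the two trees drawn above.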

We can resolve this ambiguity by forcing either * or + to take precedence. By convention, * binds tighter than + in the order of operations, although as far as the grammar is concerned the choice is arbitrary. Giving * precedence yields:

E -> f + E | A
A -> A * f | f

Now, the string f + f * f has just one parse tree:

    E
    |
    +-+-+
    | | |
    f + E
        |
        A
        |
        +-+-+
        A * f
        |
        f

Now, put F back in place of f and restore the original rule for F:

E -> F + E | A
A -> A * F | F
F -> F - F | id

Is this ambiguous? It is. Consider the string id - id - id.

E                    E
|                    |
A                    A
|                    |
F                    F
|                    |
+-----+----+----+    +----+----+----+
      |    |    |         |    |    |
      F    -    F         F    -    F
      |         |         |         |
    +-+-+       id        id      +-+-+
    F - F                         F - F
    |   |                         |   |
    id  id                        id  id

The ambiguity here is that - can be parsed as either left-associative or right-associative. We can choose the same convention as for +, which the rule E -> F + E makes right-associative:

E -> F + E | A
A -> A * F | F
F -> id - F | id

Now, id - id - id has only one parse tree:

E
|
A
|
F
|
+----+----+----+
     |    |    |
     id   -    F
               |
            +--+-+
            |  | |
            id - F
                 |
                 id
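
To illustrate how this grammar fixes the grouping, here is a small hand-written parser sketch in Python. It is an assumption made for illustration, not something from the original answer: the function name parse and the tuple representation of trees are hypothetical, and the left-recursive rule A -> A * F is implemented as a loop, which accepts the same strings and builds the same left-leaning trees.

```python
def parse(tokens):
    """Parse tokens with E -> F + E | A, A -> A * F | F, F -> id - F | id.

    Returns a nested tuple (operator, left, right), or "id" for a leaf.
    """
    tokens = list(tokens)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at token {pos}, got {peek()!r}")
        pos += 1

    def parse_F():
        # F -> id - F | id   (right-recursive, so - groups to the right)
        expect("id")
        if peek() == "-":
            expect("-")
            return ("-", "id", parse_F())
        return "id"

    def parse_E():
        left = parse_F()
        if peek() == "+":
            # E -> F + E   (right-recursive, so + groups to the right)
            expect("+")
            return ("+", left, parse_E())
        # E -> A, with A -> A * F | F written as a loop (* groups to the left)
        node = left
        while peek() == "*":
            expect("*")
            node = ("*", node, parse_F())
        return node

    tree = parse_E()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r} at token {pos}")
    return tree


print(parse("id - id - id".split()))  # ('-', 'id', ('-', 'id', 'id'))
print(parse("id * id * id".split()))  # ('*', ('*', 'id', 'id'), 'id')
```

Every choice in this sketch is decided by the next token alone, which is one way to see that each accepted string gets a single tree. A string such as id * id + id is rejected, because in this grammar (as in the original one) a + can never follow a *.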

Now, is this grammar ambiguous? It is not.

To see why, let s be any string generated by the grammar, and let #(x) denote the number of occurrences of the symbol x in s.

s will have #(+) +s in it, and we always need to use production E -> F + E exactly #(+) times and then production E -> A once.
s will have #(*) *s in it, and we always need to use production A -> A * F exactly #(*) times and then production A -> F once.
s will have #(-) -s in it, and we always need to use production F -> id - F exactly #(-) times in total, and production F -> id once to finish each F.

That s has exactly #(+) +s, #(*) *s and #(-) -s can be taken for granted (the numbers can be zero if not present in s). That E -> A and A -> F have to be used exactly once, and F -> id exactly once for each F, can be shown as follows:

If E -> A is never used, any string derived will still have E, a nonterminal, in it, and so will not be a string in the language (nothing is generated without taking E -> A at least once). Also, every string that can be generated before using E -> A has at most one E in it (you start with one E, and the only other production keeps one E), so it is never possible to use E -> A more than once. So E -> A is used exactly once for all derived strings. The demonstration works the same way for A -> F, and for F -> id within the part of the derivation that starts from any single F.

That E -> F + E, A -> A * F and F -> id - F are used exactly #(+), #(*) and #(-) times, respectively, is apparent from the fact that these are the only productions that introduce their respective symbols and each introduces one instance.

Considering the sub-grammars of the resulting grammar one at a time, we can prove that each is unambiguous as follows:

F -> id - F | id

This is an unambiguous grammar for (id - )*id. The only derivation of (id - )^k id is to use F -> id - F exactly k times and then use F -> id exactly once.

A -> A * F | F

We have already seen that the grammar for F is unambiguous. By the same argument, this is an unambiguous grammar for the language F( * F)*. The derivation of F( * F)^k requires using A -> A * F exactly k times and then A -> F once. Because the grammar for F is unambiguous, and because A separates the instances of F with *, a symbol that F never generates, the grammar

A -> A * F | F
F -> id - F | id

is also unambiguous. To complete the argument, apply the same logic to the grammar generating (F + )*A from the start symbol E.
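
The argument can be sanity-checked by brute force on short strings. The Python sketch below (the function and variable names are invented for this example) enumerates every leftmost derivation up to a length bound and counts how many of them produce each string. A string with more than one leftmost derivation has more than one parse tree. Since it only looks at strings up to the bound, it supports the proof rather than replacing it.

```python
from collections import Counter

# The grammar from the question and the rewritten grammar from this answer.
ORIGINAL = {
    "E": [["E", "*", "F"], ["F", "+", "E"], ["F"]],
    "F": [["F", "-", "F"], ["id"]],
}
FINAL = {
    "E": [["F", "+", "E"], ["A"]],
    "A": [["A", "*", "F"], ["F"]],
    "F": [["id", "-", "F"], ["id"]],
}


def count_leftmost_derivations(grammar, start, max_len):
    """Map each terminal string of at most max_len tokens to the number of
    leftmost derivations that produce it."""
    counts = Counter()
    stack = [(start,)]
    while stack:
        form = stack.pop()
        # Position of the leftmost nonterminal, if any remains.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            counts[" ".join(form)] += 1
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            # No production shortens a form, so over-long forms are dead ends.
            if len(new_form) <= max_len:
                stack.append(new_form)
    return counts


original_counts = count_leftmost_derivations(ORIGINAL, "E", 5)
final_counts = count_leftmost_derivations(FINAL, "E", 7)
print(original_counts["id - id - id"])              # 2
print(all(n == 1 for n in final_counts.values()))   # True
```

The search terminates because every production either lengthens the sentential form or is one of the unit productions, and the unit productions cannot form a cycle.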

Source: https://stackoverflow.com/questions/46544478/how-to-remove-ambiguity-in-the-following-grammar

Tags

parsing

grammar