Remove ambiguity in abstract syntax in order to write a DCG parser in Prolog

长情又很酷 2020-12-19 11:11

P => Program

K => Block

S => Single-command

C => Commands

E => Expression

B => Boolean-expr

I => Identifier

N => Numeral

2 answers
  •  生来不讨喜
    2020-12-19 11:42

    @chac already gave you quite a good answer, showing you the usual way to resolve this.

    Let me read your question another way: you are "supposed to remove ambiguities in E and B so that" you "can write a DCG parser in Prolog". That means you need to remove ambiguity only as far as is necessary to write a DCG parser in Prolog. There is good news: you do not need to remove any ambiguities at all to write a DCG parser! Here is how:

    The sources of ambiguity are productions like

    C ::= C ; C

    or the productions for the other operators: +, -, juxtaposition, div, mod, and "and".

    Let me stick to a simplified grammar:

    E ::= E + E | "1"

    We could encode this as

    e --> "1".
    e --> e, "+", e.
    

    Unfortunately, Prolog does not terminate for a query like

    ?- L = "1+1+1", phrase(e,L).
    L = "1+1+1" ;
    ERROR: Out of local stack
    

    Actually, it terminates, but only because my computer's memory is finite...

    Not even for:

    ?- L = "1", phrase(e,L).
    L = "1" ;
    ERROR: Out of local stack
    

    Is this a problem of ambiguity? No! It is just a procedural problem of Prolog, which cannot handle left recursion directly. Here is a way to make Prolog handle it:

    e([_|S],S) --> "1".
    e([_|S0],S) --> e(S0,S1), "+", e(S1,S).
    
    ?- L = "1+1+1", phrase(e(L,[]),L).
    L = "1+1+1" ;
    L = "1+1+1" ;
    false.
    
    ?- L = "1", phrase(e(L,[]),L).
    L = "1" ;
    false.
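
    Why does this terminate? The two extra arguments thread a list that acts as a budget: every call to e removes one element from it, so the left-recursive call always runs with a strictly shorter list and eventually fails on the empty list. Here is a minimal, self-contained sketch of the same grammar. The helper recognize/1 and the use of an atom as input are my own assumptions, added only so the query does not depend on how your system reads double-quoted text at the toplevel:

    ```prolog
    % Make "..." in this file denote code lists (assumption: your system supports this flag).
    :- set_prolog_flag(double_quotes, codes).

    % Same grammar as above. The extra argument pair is a budget list:
    % each call to e//2 consumes exactly one element of it.
    e([_|S],  S) --> "1".
    e([_|S0], S) --> e(S0, S1), "+", e(S1, S).

    % recognize/1 is a hypothetical helper: it converts an atom to codes
    % and uses the input list itself as the budget.
    recognize(Atom) :-
        atom_codes(Atom, Codes),
        phrase(e(Codes, []), Codes).
    ```

    With this, ?- recognize('1+1+1'). succeeds, and ?- recognize('1+'). fails finitely instead of running out of stack.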
    

    So far we have only defined a grammar. Most of the time you are also interested in seeing the corresponding syntax tree:

    e(integer(1), [_|S],S) --> "1".
    e(plus(L,R), [_|S0],S) --> e(L, S0,S1), "+", e(R, S1,S).
    
    ?- L = "1+1+1", phrase(e(Tree, L,[]),L).
    L = "1+1+1",
    Tree = plus(integer(1),plus(integer(1),integer(1))) ;
    L = "1+1+1",
    Tree = plus(plus(integer(1),integer(1)),integer(1)) ;
    false.
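
    If you want all syntax trees at once rather than one per backtracking step, something like the following sketch should do. The helper trees/2 is a name I made up for illustration, and it takes an atom as input so that it does not depend on the double_quotes setting of your toplevel:

    ```prolog
    % Make "..." in this file denote code lists (assumption: your system supports this flag).
    :- set_prolog_flag(double_quotes, codes).

    % Grammar with a syntax tree as first argument and the budget list pair after it.
    e(integer(1), [_|S],  S) --> "1".
    e(plus(L,R),  [_|S0], S) --> e(L, S0, S1), "+", e(R, S1, S).

    % trees/2 is a hypothetical helper: collect every syntax tree of the input.
    % findall/3 terminates here because the budget list bounds the search.
    trees(Atom, Trees) :-
        atom_codes(Atom, Codes),
        findall(Tree, phrase(e(Tree, Codes, []), Codes), Trees).
    ```

    For example, ?- trees('1+1+1', Ts), length(Ts, N). gives N = 2, one tree per way of bracketing the input.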
    

    Now we see that there is an ambiguity with plus! Your original grammar accepts 1+1+1 both as (1+1)+1 and as 1+(1+1), which in itself is not a problem as long as the corresponding semantics is associative, so that both readings mean the same thing. Most of the time this is disambiguated to be left-associative, thus meaning (1+1)+1, but this is not the case for all infix operators.
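
    If you do want to commit to the left-associative reading, one common way (a sketch of the usual rewrite, not necessarily what your course expects) is to replace the left recursion by iteration and build the tree with an accumulator. The names expr, expr_rest and parse are hypothetical:

    ```prolog
    % Make "..." in this file denote code lists (assumption: your system supports this flag).
    :- set_prolog_flag(double_quotes, codes).

    % E ::= "1" { "+" "1" }, i.e. the same language as E ::= E + E | "1",
    % but with exactly one derivation per input.
    expr(Tree) --> "1", expr_rest(integer(1), Tree).

    % The accumulator holds the tree built so far; each "+ 1" nests it to the left.
    expr_rest(Acc, Tree)  --> "+", "1", expr_rest(plus(Acc, integer(1)), Tree).
    expr_rest(Tree, Tree) --> [].

    % parse/2 is a hypothetical helper taking an atom as input.
    parse(Atom, Tree) :-
        atom_codes(Atom, Codes),
        phrase(expr(Tree), Codes).
    ```

    Now ?- parse('1+1+1', T). has the single solution T = plus(plus(integer(1),integer(1)),integer(1)), and no extra budget arguments are needed because the grammar is no longer left-recursive.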
