Shift/Reduce conflict on C variation grammar

给你一囗甜甜゛ submitted on 2019-12-12 05:25:26

Question


I am writing a parser for a C-like grammar, but I am running into a shift/reduce conflict.

Basically, the grammar accepts an optional list of global variable declarations followed by the functions.

I have the following rules:

program: global_list function_list;

type_name : TKINT /* int */
          | TKFLOAT /* float */
          | TKCHAR /* char */
          ;

global_list : global_list var_decl ';'
            |
            ;

var_decl : type_name NAME;

function_list : function_list function_def
              |
              ;

function_def : type_name NAME '(' param_list ')' '{' func_body '}' ;

I understand that the problem arises because the parser can't decide whether the next type_name NAME belongs to global_list or function_list, and by default it expects a global_list.

Example:

int var1;

int foo(){}

error: unexpected '(', expecting ';'

Answer 1:


The problem is that a function_def can only occur after a function_list, which means that the parser needs to reduce an empty function_list (using the production function_list → ε) before it can recognize a function_def. Furthermore, it needs to make that decision by looking only at the token which follows the empty production. Since that token (the first token of a type_name) could start either a var_decl or a function_def, there is no way for the parser to decide.

Even deferring the decision by one more token won't help: both alternatives begin with a type_name followed by NAME, so it's not until the third token (';' or '(') that the correct decision can be made. So your grammar is not ambiguous, but it is LR(3).
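To make the lookahead argument concrete, here is a small Python sketch that counts how many tokens must be read before the two constructs from the question diverge. The token names TKINT and NAME are taken from the question's grammar; the helper function and the two hard-coded token lists are illustrative assumptions, not part of any parser generator.

```python
# Token sequences as a lexer for the question's grammar might produce them.
# TKINT and NAME come from the question; everything else here is illustrative.
VAR_DECL = ["TKINT", "NAME", ";"]                      # int var1;
FUNCTION_DEF = ["TKINT", "NAME", "(", ")", "{", "}"]   # int foo(){}


def tokens_needed_to_distinguish(a, b):
    """Return how many leading tokens must be read before a and b differ."""
    for count, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return count
    raise ValueError("one sequence is a prefix of the other")


print(tokens_needed_to_distinguish(VAR_DECL, FUNCTION_DEF))  # prints 3
```

A parser with a single token of lookahead sees only the first of those three tokens at the point where it would have to reduce the empty function_list.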

Sequences of possibly empty lists of different types always create this problem. By contrast, sequences of non-empty lists do not, so a first approach to solving the problem is to eliminate the ε-productions.

First, we expand the top-level definition to make it clear that both lists are optional:

program: global_list function_list
       | global_list
       | function_list
       | /* empty */
       ;

Then we make both list types non-empty:

global_list
       : var_decl ';'
       | global_list var_decl ';'
       ;

function_list
       : function_def
       | function_list function_def
       ;

The rest of the grammar is unchanged.

type_name : TKINT /* int */
          | TKFLOAT /* float */
          | TKCHAR /* char */
          ;

var_decl : type_name NAME;

function_def : type_name NAME '(' param_list ')' '{' func_body '}' ;

It's worth noting that the problem would never have arisen if declarations could be interspersed. Is it really necessary that all global variables be defined before any function? If not, you could just use a single list type, which would also be conflict-free:

program: decl_list ;

decl_list: /* empty */
         | decl_list var_decl ';'
         | decl_list function_def
         ;

Both these solutions work because a bottom-up parser can wait until the end of the production being reduced in order to decide which is the correct reduction; it does not matter that var_decl and function_def look identical until the third token.
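The idea of deciding only at the end of a production can be illustrated with a tiny hand-written shift-reduce loop in Python. This is a simplified sketch, not the table-driven automaton that Bison or yacc would generate: it assumes param_list and func_body are empty, treats a var_decl together with its terminating ';' as in the question's global_list rule, and the function name recognize is made up for the example.

```python
TYPE_TOKENS = {"TKINT", "TKFLOAT", "TKCHAR"}

# Right-hand sides, checked against the top of the stack after every shift.
# param_list and func_body are assumed empty to keep the sketch short.
RULES = [
    ("var_decl", ["type_name", "NAME", ";"]),
    ("function_def", ["type_name", "NAME", "(", ")", "{", "}"]),
]


def recognize(tokens):
    """Return the kinds of declarations found, in order; raise on bad input."""
    stack, found = [], []
    for tok in tokens:
        # Shift; a type keyword is reduced to type_name immediately.
        stack.append("type_name" if tok in TYPE_TOKENS else tok)
        # Reduce only when a complete right-hand side sits on top of the stack.
        for name, rhs in RULES:
            if stack[-len(rhs):] == rhs:
                del stack[-len(rhs):]
                found.append(name)
                break
    if stack:
        raise SyntaxError(f"unreduced symbols left on the stack: {stack}")
    return found


# int var1; int foo(){}
print(recognize(["TKINT", "NAME", ";", "TKINT", "NAME", "(", ")", "{", "}"]))
```

Nothing in the loop ever has to guess which kind of declaration is coming: the shared prefix type_name NAME simply stays on the stack until the token that completes one of the two right-hand sides has been shifted.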

The problem really is that it's hard to figure out the type of nothing.



Source: https://stackoverflow.com/questions/28665795/shift-reduce-conflict-on-c-variation-grammar
