Antlr : beginner 's mismatched input expecting ID

会有一股神秘感。 提交于 2019-11-28 02:13:41

问题


As a beginner, when I was learning ANTLR4 from the The Definitive ANTLR 4 Reference book, I tried to run my modified version of the exercise from Chapter 7:

/**
 * to parse properties file
 * this example demonstrates using embedded actions in code
 */
grammar PropFile;

@header  {
    import java.util.Properties;
}
@members {
    Properties props = new Properties();
}
file
    : 
    {
        System.out.println("Loading file...");
    }
        prop+
    {
        System.out.println("finished:\n"+props);
    }
    ;

prop
    : ID '=' STRING NEWLINE 
    {
        props.setProperty($ID.getText(),$STRING.getText());//add one property
    }
    ;

ID  : [a-zA-Z]+ ;
STRING  :(~[\r\n])+; //if use  STRING : '"' .*? '"'  everything is fine
NEWLINE :   '\r'?'\n' ;

Since Java properties are just key-value pair I use STRING to match eveything except NEWLINE (I don't want it to just support strings in the double-quotes). When running following sentence, I got:

D:\Antlr\Ex\PropFile\Prop1>grun PropFile prop -tokens
driver=mysql
^Z
[@0,0:11='driver=mysql',<3>,1:0]
[@1,12:13='\r\n',<4>,1:12]
[@2,14:13='<EOF>',<-1>,2:14]
line 1:0 mismatched input 'driver=mysql' expecting ID

When I use STRING : '"' .*? '"' instead, it works.

I would like to know where I was wrong so that I can avoid similar mistakes in the future.

Please give me some suggestion, thank you!


回答1:


Since both ID and STRING can match the input text starting with "driver", the lexer will choose the longest possible match, even though the ID rule comes first.

So, you have several choices here. The most direct is to remove the ambiguity between ID and STRING (which is how your alternative works) by requiring the string to start with the equals sign.

file : prop+ EOF ;
prop : ID STRING NEWLINE ;

ID      : [a-zA-Z]+ ;
STRING  : '=' (~[\r\n])+;
NEWLINE : '\r'?'\n' ;

You can then use an action to trim the equals sign from the text of the string token.

Alternately, you can use a predicate to disambiguate the rules.

file : prop+ EOF ;
prop : ID '=' STRING NEWLINE ;

ID      : [a-zA-Z]+ ;
STRING  : { isValue() }? (~[\r\n])+; 
NEWLINE : '\r'?'\n' ;

where the isValue method looks backwards on the character stream to verify that it follows an equals sign. Something like:

@members {
public boolean isValue() {
    int offset = _tokenStartCharIndex;
    for (int idx = offset-1; idx >=0; idx--) {
        String s = _input.getText(Interval.of(idx, idx));
        if (Character.isWhitespace(s.charAt(0))) {
            continue;
        } else if (s.charAt(0) == '=') {
            return true;
        } else {
            break;
        }
    }
    return false;
}
}


来源:https://stackoverflow.com/questions/23318705/antlr-beginner-s-mismatched-input-expecting-id

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!