ANTLR4 Lexer error reporting (length of offending characters)

笑着哭i 提交于 2019-12-08 18:50:31

You should write your lexer such that a syntax error is impossible. In ANTLR 4, it is easy to do this by simply adding the following as the last rule of your lexer:

ErrorChar : . ;

By doing this, your errors are moved from the lexer to the parser.

In some cases, you can take additional steps to help users while they edit code in your IDE. For example, suppose your language supports double-quoted strings of the following form, which cannot span multiple lines:

StringLiteral : '"' ~[\r\n"]* '"';

You can improve error reporting in your IDE by using the following pair of rules:

StringLiteral : '"' ~[\r\n"]* '"';
UnterminatedStringLiteral : '"' ~[\r\n"]*;

You can then override the emit() method to treat the UnterminatedStringLiteral in a special way. As a result, the user sees a great error message and the parser sees a single StringLiteral token that it can generally handle well.

@Override
public Token emit() {
    switch (getType()) {
    case UnterminatedStringLiteral:
        setType(StringLiteral);
        Token result = super.emit();
        // you'll need to define this method
        reportError(result, "Unterminated string literal");
        return result;
    default:
        return super.emit();
    }
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!