How to create a lexical analyzer in ANTLR 4 that can catch different types of lexical errors

狂风中的少年 submitted on 2019-12-11 18:08:22

Question


I am using ANTLR 4 to create my lexer, but I don't know how to create a lexical analyzer that catches different types of lexical errors.

For example:

  1. If I have an unrecognized symbol like ^, the lexical analyzer should report an error like this: "Unrecognized symbol "^""

  2. If I have an invalid identifier like 2n, the lexical analyzer should report an error like this: "identifier "2n" must begin with a letter"

Can you please help me?


Answer 1:


Create an error token rule for each known error and a "catch-all" error token rule at the end, like this:

// valid tokens first!
Number : [0-9]+;
Identifier : [a-zA-Z] [a-zA-Z0-9]*;
//...

// "error" tokens
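// don't reference these tokens in your parser rules; they will show up as extraneous tokens during parsing and can be handled if desired.
// an all-digit input such as 12 matches both Number and InvalidIdentifier at the same length; Number wins because it is defined first.
InvalidIdentifier : [0-9]([0-9a-zA-Z])+;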
ACommonInvalidToken : '^'; // if you want to be more specific for certain cases
// add more to address common mistakes

UnknownToken : . ; // the "catch-all" error token; be sure not to be too greedy...
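
To see why the rule order above works, here is a minimal plain-Java sketch that imitates how the lexer chooses a rule: it takes the longest match at the current position and, on a tie, the rule defined first. It does not use the ANTLR runtime; it only simulates that selection with java.util.regex. The class name ErrorTokenDemo, the method lexErrors, and the WS rule are assumptions made for illustration and are not part of the grammar above. The messages are the ones asked for in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: simulates longest-match, first-rule-wins token selection.
class ErrorTokenDemo {
    // Rules in grammar order (LinkedHashMap keeps insertion order).
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("Number", Pattern.compile("[0-9]+"));
        RULES.put("Identifier", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+")); // assumed whitespace rule
        RULES.put("InvalidIdentifier", Pattern.compile("[0-9][0-9a-zA-Z]+"));
        RULES.put("ACommonInvalidToken", Pattern.compile("\\^"));
        RULES.put("UnknownToken", Pattern.compile(".", Pattern.DOTALL)); // catch-all
    }

    // Returns one message per error token found in the input.
    static List<String> lexErrors(String input) {
        List<String> errors = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input).region(pos, input.length());
                // Strictly longer matches replace the current best, so ties go to the earlier rule.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestRule = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestRule == null) { // defensive: the catch-all should always match
                bestRule = "UnknownToken";
                bestEnd = pos + 1;
            }
            String text = input.substring(pos, bestEnd);
            if (bestRule.equals("InvalidIdentifier")) {
                errors.add("identifier \"" + text + "\" must begin with a letter");
            } else if (bestRule.equals("ACommonInvalidToken") || bestRule.equals("UnknownToken")) {
                errors.add("Unrecognized symbol \"" + text + "\"");
            }
            pos = bestEnd;
        }
        return errors;
    }

    public static void main(String[] args) {
        for (String message : lexErrors("abc 2n ^ 42 $")) {
            System.out.println(message);
        }
    }
}
```

In this sketch, 2n is reported as an invalid identifier because InvalidIdentifier matches two characters while Number matches only one, whereas 42 stays a Number because both rules match the same length and Number comes first. With a real generated lexer you would instead check the type of each token it produces against the error token types.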


Source: https://stackoverflow.com/questions/28678232/how-to-create-a-lexical-analyzer-in-antlr-4-that-can-catch-different-types-of-le
