How to create a lexical analyzer in ANTLR 4 that can catch different types of lexical errors

Question

I am using ANTLR 4 to create my lexer, but I don't know how to create a lexical analyzer that catches different types of lexical errors.

For example:

If I have an unrecognized symbol like ^, the lexical analyzer should report an error like this: Unrecognized symbol "^"
If I have an invalid identifier like 2n, the lexical analyzer should report an error like this: identifier "2n" must begin with a letter

Can you please help me?

Answer 1:

Create an error token rule for each known error, plus a "catch-all" error token rule at the end, like this:

// valid tokens first!
Number : [0-9]+;
Identifier : [a-zA-Z] [a-zA-Z0-9]*;
//...

// "error" tokens
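// don't use these tokens in your parser rules; they will show up as extraneous tokens during parsing and can be handled if desired.
InvalidIdentifier : [0-9]([0-9a-zA-Z])+; // all-digit input still lexes as Number, because Number is listed first and the match length is the same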
ACommonInvalidToken : '^'; // if you want to be more specific for certain cases
// add more to address common mistakes

UnknownToken : . ; // the "catch-all" error token; be sure not to be too greedy...
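
To see why the rule order above produces the two messages asked for in the question, here is a minimal standalone sketch in Java. It does not use the ANTLR runtime: it only imitates the lexer's "longest match wins, earlier rule wins on a tie" behaviour with java.util.regex. The class name ErrorTokenDemo, the method lexErrors, the Whitespace rule, and the message formats are assumptions made for illustration; in a real project you would inspect the token types produced by your generated lexer instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: imitates the rule ordering of the grammar above
// with plain regular expressions. It is NOT generated by ANTLR.
class ErrorTokenDemo {
    // Rules in the same order as the grammar; LinkedHashMap preserves that order.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("Number", Pattern.compile("[0-9]+"));
        RULES.put("Identifier", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
        RULES.put("Whitespace", Pattern.compile("[ \\t\\r\\n]+")); // assumed; not in the original grammar
        RULES.put("InvalidIdentifier", Pattern.compile("[0-9][0-9a-zA-Z]+"));
        RULES.put("ACommonInvalidToken", Pattern.compile("\\^"));
        RULES.put("UnknownToken", Pattern.compile(".", Pattern.DOTALL)); // catch-all, always last
    }

    // Scans the input and returns one message per error token found.
    static List<String> lexErrors(String input) {
        List<String> errors = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strict '>' means a later rule only wins with a longer match,
                // so the earlier rule wins when match lengths are equal.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestRule = rule.getKey();
                    bestEnd = m.end();
                }
            }
            // The catch-all matches any single character, so bestRule is never null here.
            String text = input.substring(pos, bestEnd);
            if ("InvalidIdentifier".equals(bestRule)) {
                errors.add("identifier \"" + text + "\" must begin with a letter");
            } else if ("ACommonInvalidToken".equals(bestRule) || "UnknownToken".equals(bestRule)) {
                errors.add("Unrecognized symbol \"" + text + "\"");
            }
            pos = bestEnd;
        }
        return errors;
    }

    public static void main(String[] args) {
        for (String message : lexErrors("x ^ 2n 42")) {
            System.out.println(message);
        }
    }
}
```

In this sketch, 42 is still reported as a valid Number even though InvalidIdentifier also matches it, because both matches have the same length and Number comes first. 2n becomes an InvalidIdentifier because that rule matches more characters than Number does.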

Source: https://stackoverflow.com/questions/28678232/how-to-create-a-lexical-analyzer-in-antlr-4-that-can-catch-different-types-of-le

Tags

java

compiler-construction

antlr

antlr4

lexical-analysis
