How to parse keywords as normal words some of the time in ANTLR4

Posted by 故事扮演 on 2019-12-13 07:19:28

Question


I have a language with keywords like hello that are only keywords in certain types of sentences. In other types of sentences, these words should be matched as an ID, for example. Here's a super simple grammar that tells the story:

grammar Hello;

file : ( sentence )* ;
sentence : 'hello' ID PERIOD
         | INT ID PERIOD;

ID  : [a-z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
PERIOD : '.' ;

I'd like these sentences to be valid:

hello fred.
31 cheeseburgers.
6 hello.

but that last sentence doesn't work in this grammar. The word hello is a token of type hello and not of type ID. It seems like the lexer grabs all the hellos and turns them into tokens of that type.

Here's a crazy way to do it, to explain what I want:

sentence : 'hello' ID PERIOD
         | INT crazyID PERIOD;

crazyID : ID | 'hello' ;

but in my real language, there are a lot of keywords like hello to deal with, so, yeah, that way seems crazy.

Is there a reasonable, compact, target-language-independent way to handle this?


Answer 1:


A standard way of handling keywords:

file     : ( sentence )* EOF ;
sentence : key=( KEYWORD | INT ) id=( KEYWORD | ID ) PERIOD ;

KEYWORD : 'hello' | 'goodbye' ; // list others as alts
PERIOD  : '.' ;
ID      : [a-z]+ ;
INT     : [0-9]+ ;
WS      : [ \t\r\n]+ -> skip ;

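The apparent ambiguity between the KEYWORD and ID rules is resolved by rule order: when two lexer rules match the same (longest) stretch of input, ANTLR picks the rule listed first, so KEYWORD must appear before ID. The parser rule then accepts a KEYWORD token wherever a plain word is allowed.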
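To make the tie-break concrete, here is a small Python sketch that imitates the "longest match wins, earlier rule wins ties" behaviour for the five lexer rules above. It is only an illustration, not ANTLR's actual implementation; the names RULES and tokenize are made up for this example, and the rules are approximated with Python regular expressions.

```python
import re

# Lexer rules in the same order as the grammar above; order decides ties.
RULES = [
    ("KEYWORD", re.compile(r"hello|goodbye")),
    ("PERIOD", re.compile(r"\.")),
    ("ID", re.compile(r"[a-z]+")),
    ("INT", re.compile(r"[0-9]+")),
    ("WS", re.compile(r"[ \t\r\n]+")),
]

def tokenize(text):
    """Return (type, text) pairs, skipping whitespace (hypothetical helper)."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_type, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # A strictly longer match replaces the current best;
            # on equal length the earlier rule is kept.
            if m and len(m.group()) > best_len:
                best_type, best_len = name, len(m.group())
        if best_type is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if best_type != "WS":
            tokens.append((best_type, text[pos:pos + best_len]))
        pos += best_len
    return tokens

print(tokenize("6 hello."))
print(tokenize("hellos"))
```

In this sketch hello comes out as KEYWORD because KEYWORD and ID match the same five characters and KEYWORD is listed first, while hellos comes out as ID because the ID match is longer.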

In the generated SentenceContext class, the labels key and id become Token fields that hold the matched tokens after parsing, so each word can be identified by its position in the sentence regardless of its token type.
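The idea of positional labels can be sketched in Python without ANTLR. The function below is a hand-written stand-in for the sentence rule, not generated parser code: match_sentence and the (type, text) tuple format are assumptions made for this example.

```python
def match_sentence(tokens):
    """Match key=(KEYWORD|INT) id=(KEYWORD|ID) PERIOD against one sentence.

    tokens is a list of (type, text) pairs; returns a dict holding the
    labelled tokens, or None if the sentence does not match.
    """
    if len(tokens) != 3:
        return None
    key, ident, period = tokens
    if key[0] not in ("KEYWORD", "INT"):
        return None
    if ident[0] not in ("KEYWORD", "ID"):
        return None
    if period[0] != "PERIOD":
        return None
    # The label tells us the role; the token type tells us what was written.
    return {"key": key, "id": ident}

result = match_sentence([("INT", "6"), ("KEYWORD", "hello"), ("PERIOD", ".")])
print(result["id"])
```

Here the word hello is reached through the id label even though its token type is KEYWORD, which is the effect the labelled rule is meant to give.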



Source: https://stackoverflow.com/questions/38755469/how-to-parse-keywords-as-normal-words-some-of-the-time-in-antlr4
