ANTLR4 lexer rules don't work as expected

[愿得一人] 2020-12-20 00:52

I want to write a lexer rule that matches a month and a year. Written as a regular expression, the rule is:

"hello"[0-9]{1,2}"ever"([0-9]{2}([0-9]{2})?)?

1 answer
  • 2020-12-20 01:26

    You must realise that ANTLR's lexer runs independently of the parser: it does not "listen" to what the parser might need at a certain position in a parser rule. The lexer tries to match as many characters as possible, and when two (or more) rules match the same number of characters, the rule defined first in the grammar file wins.

    In your case that means that 15 will always be tokenized as a TimeDate and never as a TimeYear because both rules match 15 but TimeDate is defined first. 2015 will be tokenized as a TimeYear because no other rule matches 4 digits.
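
    The selection logic described above can be simulated in a few lines of Python. This is only a sketch, not ANTLR itself: the two patterns are assumed stand-ins for TimeDate and TimeYear (their real definitions are not shown in the question), and `next_token` is a hypothetical helper.

```python
import re

# Assumed stand-ins for the lexer rules, listed in grammar-file order.
# The real definitions are not shown in the question.
RULES = [
    ("TimeDate", re.compile(r"[0-9]{1,2}")),
    ("TimeYear", re.compile(r"[0-9]{2}(?:[0-9]{2})?")),
]


def next_token(text, pos=0):
    """Pick a rule the way the answer describes: longest match wins,
    and on a tie the rule defined first wins."""
    best = None
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        if m is None:
            continue
        length = m.end() - pos
        # Strictly greater: an equal-length match never displaces
        # a rule that was defined earlier.
        if best is None or length > best[1]:
            best = (name, length)
    if best is None:
        raise ValueError(f"no rule matches at position {pos}")
    name, length = best
    return name, text[pos:pos + length]


print(next_token("15"))    # ('TimeDate', '15')   tie, first rule wins
print(next_token("2015"))  # ('TimeYear', '2015') only TimeYear covers 4 digits
```

    Under these assumed patterns, swapping the order of the two entries in `RULES` would make 15 a TimeYear instead, which is the ordering effect at work.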

    A solution would be to change TimeYear into a parser rule:

    timeYear
     : TimeDate TimeDate?
     ;
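
    To see what the parser rule accepts, here is a rough Python sketch. It assumes TimeDate is `[0-9]{1,2}` and that the TimeYear lexer rule has been removed. `tokenize` and `is_time_year` are hypothetical helpers, not part of ANTLR.

```python
import re

# Assumed definition of the TimeDate lexer rule (not shown in the question).
TIME_DATE = re.compile(r"[0-9]{1,2}")


def tokenize(text):
    """Split the input into TimeDate tokens, greedily, left to right."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TIME_DATE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(m.group())
        pos = m.end()
    return tokens


def is_time_year(text):
    """Mirror of the parser rule: timeYear : TimeDate TimeDate? ;"""
    return 1 <= len(tokenize(text)) <= 2


print(tokenize("2015"), is_time_year("2015"))  # ['20', '15'] True
print(tokenize("15"), is_time_year("15"))      # ['15'] True
print(tokenize("201"), is_time_year("201"))    # ['20', '1'] True
```

    Note that under this assumption the parser rule is looser than the original regular expression: it also accepts one-digit and three-digit input such as 5 or 201, so an extra check on the matched text may be needed if only two-digit or four-digit years are valid.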
    