Range quantifier syntax in ANTLR Regex

烂漫一生 submitted on 2019-12-23 08:02:02

Question


This should be fairly simple. I'm working on a lexer grammar using ANTLR and want to limit variable identifiers to a maximum of 30 characters. I tried to do this with the following rule, which follows normal regex syntax apart from the quoted character literals:

ID  :   ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'){0,29}  {System.out.println("IDENTIFIER FOUND.");}
    ;

No errors in code generation, but compilation failed due to a line in the generated code that was simply:

0,29

Obviously ANTLR is taking the text between the braces and placing it in the accept-state code along with the print statement. I searched the ANTLR site but found no example of, or reference to, an equivalent expression. What should the syntax of this expression be?


Answer 1:


ANTLR does not support the {m,n} quantifier syntax. ANTLR sees the {} of your quantifier and can't tell them apart from the {} that surround your actions.

Workarounds:

  1. Enforce the limit semantically. Let the lexer gather an ID of any length, then report an error or truncate it in your action code or later in the compiler.
  2. Create the quantification rules manually.

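This is an example of a manual rule that limits IDs to 8 characters. SUBID is declared as a fragment so that it only serves as a building block for ID and is never emitted as a token on its own.

fragment SUBID : ('a'..'z'|'A'..'Z'|'0'..'9'|'_')
      ;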
ID : ('a'..'z'|'A'..'Z')
     (SUBID (SUBID (SUBID (SUBID (SUBID (SUBID SUBID?)?)?)?)?)?)?
   ;
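
For the 30-character limit in the question, the same pattern needs 29 nested optional groups, which is tedious to write by hand. A minimal sketch of a generator follows; the class and method names (BoundedRuleGenerator, nestedOptional) are hypothetical and not part of ANTLR. It only builds the rule text, which you would then paste into the grammar.

```java
// Hypothetical helper: builds the nested optional pattern used in the manual rule.
final class BoundedRuleGenerator {

    // Returns text that matches 0..maxTail occurrences of ruleName,
    // e.g. maxTail = 2 gives "(SUBID SUBID?)?".
    static String nestedOptional(String ruleName, int maxTail) {
        if (maxTail < 1) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        // Open maxTail - 1 groups, each starting with one mandatory reference.
        for (int i = 0; i < maxTail - 1; i++) {
            sb.append('(').append(ruleName).append(' ');
        }
        // The innermost reference is optional by itself.
        sb.append(ruleName).append('?');
        // Close every group and make it optional.
        for (int i = 0; i < maxTail - 1; i++) {
            sb.append(")?");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // One leading letter plus at most 29 SUBID characters: 30 in total.
        System.out.println("ID : ('a'..'z'|'A'..'Z') "
                + nestedOptional("SUBID", 29) + " ;");
    }
}
```

Calling nestedOptional("SUBID", 7) reproduces the tail of the 8-character rule shown above.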

Personally, I'd go with the semantic solution (#1). There is very little reason these days to limit identifier length in a language, and even less reason to cause a syntax error (an early abort of the compile) when such a limit is exceeded.
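
The semantic check from workaround 1 might look roughly like the sketch below. The helper class IdentifierLimit and its methods are hypothetical. The assumption is that the ID rule keeps an unbounded tail (a * in place of {0,29}) and that its action passes the matched text, for example from the lexer's getText(), to this helper.

```java
// Hypothetical helper for enforcing an identifier length limit after matching.
final class IdentifierLimit {

    // Limit taken from the question: at most 30 characters per identifier.
    static final int MAX_LENGTH = 30;

    // True when the matched identifier exceeds the limit.
    static boolean isTooLong(String text) {
        return text.length() > MAX_LENGTH;
    }

    // Returns the identifier unchanged, or cut down to MAX_LENGTH characters.
    static String truncate(String text) {
        return isTooLong(text) ? text.substring(0, MAX_LENGTH) : text;
    }

    public static void main(String[] args) {
        String id = "a_rather_long_identifier_name_of_40_chars";
        if (isTooLong(id)) {
            // Report a warning and keep going rather than aborting the compile.
            System.err.println("identifier longer than " + MAX_LENGTH
                    + " characters, truncated: " + id);
        }
        System.out.println(truncate(id));
    }
}
```

Whether to truncate, warn, or reject is a language design choice; the point is that the lexer still produces a single ID token either way.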



Source: https://stackoverflow.com/questions/12188827/range-quantifier-syntax-in-antlr-regex
