antlr4 | 易学教程

antlr4: Grammar ambiguity, left-recursion, both?

Question: My grammar, shown below, does not compile. The returned error (from the antlr4 maven plugin) is:

    [INFO] --- antlr4-maven-plugin:4.3:antlr4 (default-cli) @ beebell ---
    [INFO] ANTLR 4: Processing source directory /Users/kodecharlie/workspace/beebell/src/main/antlr4
    [INFO] Processing grammar: DateRange.g4
    org\antlr\v4\parse\GrammarTreeVisitor.g: node from line 13:87 mismatched tree node: startTime expecting <UP>
    org\antlr\v4\parse\GrammarTreeVisitor.g: node from after line 13:87 mismatched tree

Antlr for multiple language generation

Question: This post about the ANTLR simple example shows how to create and use a grammar for Java. However, it intermixes the grammar and the Java source code in the Exp.g source. My question is: is it possible to decouple the grammar file from the target language, so that one grammar file can be used to generate lexers/parsers for multiple targets (Java, Scala, C++, etc.)?

Answer 1: It depends mostly on the reason why target code is used in the grammar. Is it only action code to do something with the found tokens

How to parse keywords as normal words some of the time in ANTLR4

Question: I have a language with keywords like hello that are only keywords in certain types of sentences. In other types of sentences, these words should be matched as an ID, for example. Here's a super simple grammar that tells the story:

    grammar Hello;
    file : ( sentence )* ;
    sentence : 'hello' ID PERIOD | INT ID PERIOD;
    ID : [a-z]+ ;
    INT : [0-9]+ ;
    WS : [ \t\r\n]+ -> skip ;
    PERIOD : '.' ;

I'd like these sentences to be valid:

    hello fred.
    31 cheeseburgers.
    6 hello.

but that last sentence doesn't work
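Why the last sentence fails can be seen by mimicking the lexer's matching policy by hand. The following is a minimal Python sketch, not ANTLR itself: it assumes the usual ANTLR 4 behaviour that the longest match wins, that ties go to the rule defined first, and that the literal 'hello' in the parser rule becomes an implicit token ranked ahead of ID. The names `RULES`, `SKIPPED`, and `tokenize` are made up for illustration.

```python
import re

# Token rules in priority order (hypothetical stand-in for the Hello grammar).
# The literal 'hello' from the parser rule is listed first, ahead of ID.
RULES = [
    ("'hello'", re.compile(r"hello")),
    ("ID", re.compile(r"[a-z]+")),
    ("INT", re.compile(r"[0-9]+")),
    ("WS", re.compile(r"[ \t\r\n]+")),
    ("PERIOD", re.compile(r"\.")),
]
SKIPPED = {"WS"}  # mirrors "-> skip"


def tokenize(text):
    """Longest match wins; on a tie, the rule listed first wins."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Strict ">" keeps the earlier rule when two matches have equal length.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name not in SKIPPED:
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("hello fred."))
print(tokenize("6 hello."))
```

In this sketch "6 hello." lexes as INT, 'hello', PERIOD, so the alternative `INT ID PERIOD` never sees an ID. The lexer decides without any knowledge of the parser's context, which is why a commonly suggested remedy is to handle this on the parser side, for example with a rule that accepts either ID or the keyword token.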

Flex ‘r/s’ in ANTLRv4

Question: Flex: ‘r/s’: an ‘r’ but only if it is followed by an ‘s’. The text matched by ‘s’ is included when determining whether this rule is the longest match, but is then returned to the input before the action is executed. So the action only sees the text matched by ‘r’. This type of pattern is called trailing context. (There are some combinations of ‘r/s’ that flex cannot match correctly. See Limitations, regarding dangerous trailing context.) How do I do this in ANTLRv4?

Answer 1: There are two primary ways
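The core of trailing context (match ‘r’, require ‘s’ after it, but do not consume ‘s’) can be illustrated outside both flex and ANTLR with a regular-expression lookahead. The Python sketch below is only an analogy under that assumption: the helper name `match_trailing_context` is hypothetical, and the sketch does not model flex's rule that the text of ‘s’ counts toward the longest-match comparison between competing rules.

```python
import re


def match_trailing_context(r, s, text, pos=0):
    """Match pattern r at pos only if pattern s follows; return the text of r alone."""
    # (?=...) is a lookahead: it must match, but consumes no input,
    # so the text matched by s stays available for the next token.
    m = re.compile(f"(?:{r})(?={s})").match(text, pos)
    return m.group() if m else None


print(match_trailing_context(r"[0-9]+", r"px", "12px"))  # digits followed by "px"
print(match_trailing_context(r"[0-9]+", r"px", "12em"))  # digits not followed by "px"
```

The first call returns only the digits, leaving "px" unconsumed; the second returns None because the required context is absent.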

Lexer, overlapping rule, but want the shorter match

Question: I want to read an input stream and divide the input into 2 types: PATTERN & WORD_WEIGHT, which are defined below. The problem arises from the fact that all the chars defined for a WORD_WEIGHT are also valid for a PATTERN. When we have multiple WORD_WEIGHTs without spaces between them, the lexer will match PATTERN rather than deliver multiple WORD_WEIGHTs. I need to be able to handle the following cases and get the indicated result:

    [20] => WORD_WEIGHT
    cat => PATTERN
    [dog] => PATTERN

And this case,

Antlr4 is printing 'Extraneous input' error even with expected input

Question: I'm trying to parse SMILES strings using the OpenSMILES specification. The grammar:

    grammar SMILES;
    atom: bracket_atom | aliphatic_organic | aromatic_organic | '*';
    aliphatic_organic: 'B' | 'C' | 'N' | 'O' | 'S' | 'P' | 'F' | 'Cl' | 'Br' | 'I';
    aromatic_organic: 'b' | 'c' | 'n' | 'o' | 's' | 'p';
    bracket_atom: '[' isotope? symbol chiral? hcount? charge? atom_class? ']';
    symbol: element_symbols | aromatic_symbols | '*';
    isotope: NUMBER;
    element_symbols: UPPER_CASE_CHAR LOWER_CASE_CHAR?;

Storing line number in ANTLR Parse Tree

Question: Is there any way of storing line numbers in the created parse tree, using ANTLR 4? I came across this article: http://puredanger.github.io/tech.puredanger.com/2007/02/01/recovering-line-and-column-numbers-in-your-antlr-ast/, which does it, but I think it's for an older ANTLR version, because parser.setASTFactory(factory); does not seem to be applicable to ANTLR 4. I am thinking of having something like treenode.getLine(), just like we can have treenode.getChild().

Answer 1: With Antlr4, you normally

Lexer to handle lines with line number prefix

Question: I'm writing a parser for a language that looks like the following:

    L00<<identifier>>
    L10<<keyword>>
    L250<<identifier>>
    <<identifier>>

That is, each line may or may not start with a line number of the form Lxxx.. ('L' followed by one or more digits) followed by an identifier or a keyword. Identifiers are standard [a-zA-Z_][a-zA-Z0-9_]* and the number of digits following the L is not fixed. Spaces between the line number and following identifier/keyword are optional (and not present in most cases

ANTLR4: Invoke different sub-parser for specific rule

Question: Consider this very simplified example where an input of the following form should be matched:

    mykey -> This is the value

My real case is much more complex, but this will do to show what I am trying to achieve. mykey is an ID, while on the right side of -> we have a set of Words. If I use

    grammar Root;
    parse : ID '->' value ;
    value : Word+ ;
    ID : ('a'..'z')+ ;
    Word : ('a'..'z' | 'A'..'Z' | '0'..'9')+ ;
    WS : ' ' -> skip ;

the example won't be parsed because the lexer will give an ID token for the

ANTLR4 left-recursive error

Question: My ANTLR4 grammar in file power.g4 is this:

    assign : id '=' expr ;
    id : 'A' | 'B' | 'C' ;
    expr : expr '+' term | expr '-' term | term ;
    term : term '*' factor | term '/' factor | factor ;
    factor : expr '**' factor | '(' expr ')' | id ;
    WS : [ \t\r\n]+ -> skip ;

When I run the command

    antlr4 power.g4

this error occurs:

    error(119): power.g4::: The following sets of rules are mutually left-recursive [expr, factor, term]

What can I do?

Answer 1: To avoid the left recursion error, put all forms of an
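To see where the reported set [expr, factor, term] comes from, it can help to trace which rules may appear leftmost in which other rules. The Python sketch below is an illustration only and not how ANTLR is implemented: the grammar is transcribed by hand into a dictionary, the names `GRAMMAR`, `leftmost_edges`, `reachable`, and `mutually_left_recursive` are hypothetical, and it assumes no alternative is empty (true for this grammar). A rule that starts with itself is deliberately ignored, since ANTLR 4 accepts direct left recursion inside a single rule.

```python
# Each rule maps to its alternatives; each alternative is a list of symbols.
GRAMMAR = {
    "assign": [["id", "'='", "expr"]],
    "id": [["'A'"], ["'B'"], ["'C'"]],
    "expr": [["expr", "'+'", "term"], ["expr", "'-'", "term"], ["term"]],
    "term": [["term", "'*'", "factor"], ["term", "'/'", "factor"], ["factor"]],
    "factor": [["expr", "'**'", "factor"], ["'('", "expr", "')'"], ["id"]],
}


def leftmost_edges(grammar):
    """Map each rule to the OTHER rules that can start one of its alternatives."""
    return {
        rule: {alt[0] for alt in alts if alt[0] in grammar and alt[0] != rule}
        for rule, alts in grammar.items()
    }


def reachable(edges, start):
    """All rules reachable from start by following leftmost edges."""
    seen, stack = set(), [start]
    while stack:
        for nxt in edges[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def mutually_left_recursive(grammar):
    """Rules that can reach themselves through at least one other rule."""
    edges = leftmost_edges(grammar)
    return sorted(rule for rule in grammar if rule in reachable(edges, rule))


print(mutually_left_recursive(GRAMMAR))
```

The cycle runs expr → term → factor → expr, and it is closed by the alternative `factor : expr '**' factor`. Folding the operator alternatives into a single expr rule leaves only direct left recursion, which is the direction the answer above points in.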