antlr4 | 易学教程

ANTLR 4 extraneous input matching non lexer item


Question: I have a grammar like this:

    grammar MyGrammar;
    field : f1 (STROKE f2 f3)? ;
    f1 : FIELDTEXT+ ;
    f2 : 'A' ;
    f3 : NUMBER4 ;
    FIELDTEXT : ~['/'] ;
    NUMBER4 : [0-9][0-9][0-9][0-9];
    STROKE : '/' ;

This works well enough, and fields f1, f2 and f3 are all populated correctly, except when there is an A to the left of the / (regardless of whether the optional part is present). That additionally causes an error:

    extraneous input 'A' expecting {<EOF>, FIELDTEXT, '/'}

Some sample data: PHOEN -> OK. KLM405/A4046 ->
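One likely explanation, offered as a sketch rather than a confirmed answer: in a combined ANTLR 4 grammar, a literal such as 'A' used in a parser rule becomes an implicit token that is placed ahead of the explicit lexer rules, and the lexer picks the longest match, breaking ties in favour of the earliest rule. The Python toy model below imitates only that selection rule with regular expressions. `RULES`, `tokenize`, and the input "KAL405" are hypothetical names and data introduced for illustration; none of this is ANTLR's actual API.

```python
import re

# Toy model of the lexer's rule selection (an assumption-laden sketch, not ANTLR itself).
# The implicit token for the literal 'A' is listed first, mirroring how implicit
# literal tokens take priority over explicit lexer rules in a combined grammar.
RULES = [
    ("'A'", re.compile(r"A")),             # implicit token from f2 : 'A' ;
    ("FIELDTEXT", re.compile(r"[^'/]")),   # ~['/'] excludes both the quote and the slash
    ("NUMBER4", re.compile(r"[0-9]{4}")),
    ("STROKE", re.compile(r"/")),
]


def tokenize(text):
    """Split text into (rule_name, lexeme) pairs: longest match wins, ties go to the earliest rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            match = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two rules match the same length.
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    for sample in ("PHOEN", "KAL405", "KLM405/A4046"):
        print(sample, [name for name, _ in tokenize(sample)])
```

In this model an A anywhere in the input is emitted as the literal token 'A' and never as FIELDTEXT, so a rule like f1 : FIELDTEXT+ cannot consume it. That is consistent with the "extraneous input 'A'" message quoted in the question.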

Is there any good ways to improve the parser's performance generated using antlr4?


Question: I have spent a few days trying to fix my grammar file (uniformSQL.g4) to improve parser performance, without success. The parser takes 4000+ ms to parse the SQL case. I also tried the SLL(*) strategy; it is fast but produces a lot of mismatches. So I wonder how to get the best performance when designing the grammar. I also tried to reduce the height of the parse tree when designing the grammar, but the parser turned out to be slower. Looking forward to your suggestions, thanks. This

Antlr4: single quote rule fails when there are escape chars plus carriage return, new line


Question: I have a grammar as such:

    grammar Testquote;
    program : (Line ';')+ ;
    Line: L_S_STRING ;
    L_S_STRING : '\'' (('\'' '\'') | ('\\' '\'') | ~('\''))* '\''; // Single quoted string literal
    L_WS : L_BLANK+ -> skip ; // Whitespace
    fragment L_BLANK : (' ' | '\t' | '\r' | '\n') ;

This grammar, and L_S_STRING in particular, seems to work fine with vanilla inputs like:

    'ab'; 'cd';

However, it fails with this input:

    'yyyy-MM-dd\\'T\\'HH:mm:ss\\'Z\\''; 'cd';

Yet it works when I change the first line to

ANTLR parser for alpha numeric words which may have whitespace in between


Question: First I tried to identify a normal word, and the grammar below works fine:

    grammar Test;
    myToken: WORD;
    WORD: (LOWERCASE | UPPERCASE )+ ;
    fragment LOWERCASE : [a-z] ;
    fragment UPPERCASE : [A-Z] ;
    fragment DIGIT: '0'..'9' ;
    WHITESPACE : (' ' | '\t')+;

But as soon as I added the parser rule below just beneath "myToken", even my WORD tokens were no longer recognised for the input string "abc":

    ALPHA_NUMERIC_WS: ( WORD | DIGIT | WHITESPACE)+;

Does anyone have any idea why that is?

Answer 1: This is because ANTLR's lexer

Antlr4 import of combined grammar failing


Question: I am presently getting...

    error(56): AqlCommentTest.g4:12:4: reference to undefined rule: htmlCommentDeclaration
    error(56): AqlCommentTest.g4:13:4: reference to undefined rule: mdCommentDeclaration

The import for the lexer grammar does seem to be loading. The following files present the problem.

AqlCommentTest.g4

    grammar AqlCommentTest;
    import AqlLexerRules;
    import AqlComment;
    program: commentDeclaration+;
    commentDeclaration: htmlCommentDeclaration #Comment_HTML | mdCommentDeclaration

Antlr4, How to report specific syntax error


Question: I am trying to use antlr4 to write some error checking for my simple grammar. The grammar itself is made up of functions, i.e.

    FUNCTION hello (n){ ...... }
    FUNCTION main (n) { ...... }

I am not sure how it is supposed to catch specific errors such as a missing function name or a missing main function. Here is what my ErrorListener looks like:

    import org.antlr.v4.runtime.*;
    import org.antlr.v4.runtime.tree.*;
    public class SimpleErrorListener extends BaseErrorListener {
        @Override public void

How do I use custom tokens and contexts in ANTLR 4


Question: I've used ANTLR3 for quite a while. I am just switching to ANTLR 4. It is, in general, much more understandable for my students in my compiler class. However, it's not clear from the book and other documentation that I've located how to make the tokens and contexts that form the nodes of the parse tree customized classes. With ANTLR 3 I just used the options to have the generated code rename them. What about in ANTLR 4? Is there documentation that I should have been able

antlr4 mixed fragments in tokens


Question: I observe a strange behavior when trying to parse text using a grammar that contains statements like the following:

    fragment A : ('a'|'A') ;
    fragment D : ('d'|'D') ;
    fragment N : ('n'|'N') ;
    KEY_AND : A N D;

I created a simple grammar to reproduce the issue I experience:

    grammar AndTest;
    mainRule: NAME SEP KEY_AND SEP NAME;
    NAME: ('A'..'Z')+ ;
    SEP: ';' ;
    fragment A : ('a'|'A') ;
    fragment D : ('d'|'D') ;
    fragment N : ('n'|'N') ;
    KEY_AND : A N D;
    WS: [ \r\t\n]+ -> skip ;

During grun execution I
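The excerpt stops before the observed behavior is described, so the following is only a plausible reading: when two lexer rules match the same number of characters, ANTLR gives the token to the rule defined first, and in AndTest the rule NAME precedes KEY_AND. The Python sketch below models just that tie-break with regular expressions. `first_token` and the tuple constants are hypothetical helpers written for this illustration, not ANTLR calls.

```python
import re


def first_token(rules, text):
    """Toy model: return (rule_name, lexeme) for the longest match at the start of text.

    On equal length the earlier rule in `rules` is kept.
    """
    best = None
    for name, pattern in rules:
        match = re.match(pattern, text)
        if match and (best is None or len(match.group()) > len(best[1])):
            best = (name, match.group())
    return best


# Regex stand-ins for the two competing lexer rules (assumed equivalents).
NAME = ("NAME", r"[A-Z]+")
KEY_AND = ("KEY_AND", r"[aA][nN][dD]")

if __name__ == "__main__":
    print(first_token([NAME, KEY_AND], "AND"))    # order used in AndTest
    print(first_token([KEY_AND, NAME], "AND"))    # keyword rule listed first
    print(first_token([KEY_AND, NAME], "ANDY"))   # longer match still wins
```

Under this model the input AND is lexed as NAME with the original rule order and as KEY_AND once the keyword rule is listed first, while a longer identifier such as ANDY remains a NAME either way.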

Antlr: how to match everything between the other recognized tokens?


Question: How do I match all of the leftover text between the other tokens in my lexer? Here's my code:

    grammar UserQuery;
    expr: expr AND expr | expr OR expr | NOT expr | TEXT+ | '(' expr ')' ;
    OR : 'OR';
    AND : 'AND';
    NOT : 'NOT';
    LPAREN : '(';
    RPAREN : ')';
    TEXT: .+?;

When I run the lexer on "xx AND yy", I get these tokens:

    x type:TEXT x type:TEXT type:TEXT AND type:'AND' type:TEXT y type:TEXT y type:TEXT

This sort of works, except that I don't want each character to be a token. I'd like to
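The one-character TEXT tokens are what a non-greedy loop at the end of a rule would be expected to produce: with nothing after `.+?` to force it to continue, it stops after the first character. Python's `re` module is a different engine from ANTLR's lexer, but its lazy quantifier behaves the same way in this situation, so it serves as a quick analogy. The variable names below are illustrative only.

```python
import re

text = "xx AND yy"

# Lazy quantifier with nothing after it: stops as soon as one character is consumed.
lazy = re.match(r".+?", text).group()

# Greedy quantifier: consumes as much as it can.
greedy = re.match(r".+", text).group()

# Repeating the lazy pattern over an input yields one match per character.
lazy_pieces = re.findall(r".+?", "xx yy")

if __name__ == "__main__":
    print(repr(lazy))
    print(repr(greedy))
    print(lazy_pieces)
```

The analogy covers only the non-greedy behavior. It does not model how ANTLR chooses between TEXT and the keyword rules.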

How to rewrite Antlr4 Parse Tree manually?


Question: I am working on a simple XQuery processor and using Antlr4 to parse the grammar. I use the visitor pattern to walk through the parse tree. Now I want to rewrite a query if it meets some condition. The processor can now process a query if it directly uses a keyword like "join" and matches the "join" grammar. I want to first rewrite the parse tree if the query can be changed to a join query, or do nothing if not. Is there a way to manually manipulate the parse tree? Like adding a