antlr4 | 易学教程

Can't import module ANTLR MyGrammarLexer and MyGrammarParser

Question: I'm trying to get started with ANTLR. Importing the antlr4 module works fine, but when I try to import MyGrammarLexer and MyGrammarParser, it reports that MyGrammarLexer and MyGrammarParser aren't in lib. I'm using PyCharm, and I installed ANTLR with:

    pip3 install antlr4-python3-runtime

My code is:

    import sys
    from antlr4 import *
    import MyGrammarLexer
    import MyGrammarParser

    def main(argv):
        input = FileStream(argv[1])
        lexer = MyGrammarLexer(input)
        stream = CommonTokenStream(lexer)
        parser =
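One thing worth checking in code like this, offered as an assumption because the excerpt is cut off before any answer: `import MyGrammarLexer` binds the module, not the class of the same name inside it, and the directory holding the generated files has to be on the import path. The sketch below uses only the standard library and a hypothetical stand-in for the generated file, so it runs without ANTLR installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a generated MyGrammarLexer.py: a module that
# contains a class with the same name as the module.
module_dir = tempfile.mkdtemp()
Path(module_dir, "MyGrammarLexer.py").write_text(
    "class MyGrammarLexer:\n"
    "    def __init__(self, input_stream):\n"
    "        self.input_stream = input_stream\n"
)

# The directory with the generated files must be importable.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import MyGrammarLexer as lexer_module      # binds the module object
from MyGrammarLexer import MyGrammarLexer  # binds the class inside it

try:
    lexer_module("input")                  # calling the module itself fails
    module_call_error = None
except TypeError as exc:
    module_call_error = str(exc)

lexer = MyGrammarLexer("input")            # calling the class works
print(module_call_error)
```

If the import itself fails rather than the call, the more likely cause is that the lexer and parser files were never generated from the grammar or are not in a directory Python searches.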

how to report grammar ambiguity in antlr4

Question: According to the ANTLR 4 book (page 159), and using the grammar Ambig.g4, grammar ambiguity can be reported with:

    grun Ambig stat -diagnostics

or equivalently, in code form:

    parser.removeErrorListeners();
    parser.addErrorListener(new DiagnosticErrorListener());
    parser.getInterpreter().setPredictionMode(PredictionMode.LL_EXACT_AMBIG_DETECTION);

The grun command reports the ambiguity properly for me, using antlr-4.5.3. But when I use the code form, I don't get the ambiguity report. Here is the

Getting plain text in antlr instead of tokens

Question: I'm trying to create a parser using ANTLR. My grammar is as follows:

    code : codeBlock* EOF;
    codeBlock : text | tag1Ops | tag2Ops ;
    tag1Ops: START_1_TAG ID END_2_TAG ;
    tag2Ops: START_2_TAG ID END_2_TAG ;
    text: ~(START_1_TAG|START_2_TAG)+;
    START_1_TAG : '<%' ;
    END_1_TAG : '%>' ;
    START_2_TAG : '<<';
    END_2_TAG : '>>' ;
    ID : [A-Za-z_][A-Za-z0-9_]*;
    INT_NUMBER: [0-9]+;
    WS : ( ' ' | '\n' | '\r' | '\t')+ -> channel(HIDDEN);
    SPACES: SPACE+;
    ANY_CHAR : .;
    fragment SPACE : ' ' | '\r' | '\n' | '\t' ;

check previous/left token in lexer

Question: How can I find the previous (left) token in the lexer? For example:

    lexer grammar TLexer;
    ID : [a-zA-Z] [a-zA-Z0-9]*;
    CARET : '^';
    RTN : {someCond1}? CARET ID; // CARET should not be part of this token
    GLB : {someCond2}? CARET ID; // CARET should not be part of this token

etc.

Answer 1: Thanks, I did it this way:

    lexer grammar TLexer;
    @lexer::members {
        int lastTokenType = 0;
        public void emit(Token token) {
            super.emit(token);
            lastTokenType = token.getType();
        }
    }
    CARET : '^';
    RTN : {someCond1&&(lastTokenType==CARET)}? ID;
    GLB :

Antlr4 doesn't correctly recognize Unicode characters

Question: I have a very simple grammar that tries to match 'é' to the token E_CODE. I tested it using the TestRig tool (with the -tokens option), but the parser can't match it correctly. My input file was encoded in UTF-8 without a BOM, and I used ANTLR version 4.4. Could somebody else also check this? I got this output on my console:

    line 1:0 token recognition error at: 'Ă'

The grammar:

    grammar Unicode;
    stat:EOF;
    E_CODE: '\u00E9' | 'é';

Answer 1: I tested the grammar:

    grammar Unicode;
    stat: E_CODE* EOF;
    E_CODE: '\u00E9' | 'é';

as
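The 'Ă' in the error message is consistent with the UTF-8 bytes of 'é' being decoded with a single-byte Central European code page. The Python sketch below reproduces that effect using only the standard library. The choice of cp1250 is an assumption about the asker's platform default encoding; the excerpt does not state it.

```python
# UTF-8 encodes 'é' (U+00E9) as the two bytes C3 A9.
utf8_bytes = "é".encode("utf-8")

# Decoding those bytes with a single-byte code page yields two characters
# instead of one. cp1250 is assumed here purely for illustration.
misread = utf8_bytes.decode("cp1250")

# Decoding with the encoding the file was actually written in round-trips.
correct = utf8_bytes.decode("utf-8")

print(repr(misread), repr(correct))
```

If this is the cause, telling the tool which encoding to use when reading the input (TestRig accepts an -encoding option) should let the lexer see a single 'é'.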

ANTLR4 not reporting ambiguity

Question: Given the following grammar:

    grammar ReportAmbiguity;
    unit : statements+;
    statements : callStatement+
        // '.' // <- uncomment this line
        ;
    callStatement : 'CALL' ID (argsByRef | argsByVal)*;
    argsByRef : ('BY' 'REF')? ID+;
    argsByVal : 'BY' 'VAL' ID+;
    ID : ('A'..'Z')+;
    WS : (' '|'\n')+ -> channel(HIDDEN);

When parsing the string "CALL FUNCTION BY VAL A B" through the non-root rule callStatement, everything works and the parser correctly reports an ambiguity: line 1:24 reportAttemptingFullContext d

Return the line number of the last character for current token

Question: Is there a way in ANTLR 4 to return the line number of the last character of the current token? I looked at "Antlr, get last line from token", but that would be specific to a rule. I wanted something more generic but couldn't find anything suitable in the ANTLR API.

Answer 1: There is no direct way to get this information. However, if you don't have any -> skip commands in your lexer, you can derive it from the following token. Suppose token b follows token a. If b
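A different approach from the one the answer starts to describe is to count the newlines inside the token's own text. The sketch below assumes the lexer counts lines by '\n' (ANTLR's default behavior) and that the token's start line and text are available, for example as token.line and token.text in the Python runtime. The function name is hypothetical.

```python
def last_line_of_token(start_line: int, text: str) -> int:
    """Return the line on which the token's last character sits.

    A newline character belongs to the line it terminates, so a trailing
    '\n' in the token text does not move the last character to the next line.
    """
    # Only newlines before the final character push that character down.
    return start_line + text[:-1].count("\n")


# A three-line block comment that starts on line 3 ends on line 5.
print(last_line_of_token(3, "/* a\n b\n */"))
```

This works regardless of skipped tokens because it uses only the token itself, at the cost of scanning the token text.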

antlr4 array implementation : getting values of elements

Question: I'm trying to implement arrays in antlr4, and I'm lost as to how I can get the multiple elements of the array when it is initialized like so:

    int array[] = {1, 2};

I was thinking of placing them in a HashMap like this, with the index as the key:

    public Map<Integer, Value> array_memory = new HashMap<Integer, Value>();

Below is the grammar I'm following:

    grammar GaleugParserNew;
    /* * PARSER RULES */
    declare_var : INTEGER ID '[' (INT)? ']' (ASSIGN '{' array_init '}')? SCOL ;
    array_init : INT ','

Why does ANTLR require all or none alternatives be labeled?

Question: I'm new to ANTLR. I just discovered that it is possible to label each alternative in a production like so:

    foo : a # aLabel
        | b # bLabel
        | // ...
        ;

However, I find it unpleasant that either all or none of the alternatives must be labeled. I recently needed to label just 2 alternatives of a production with 20+ branches, and I ended up labeling each of the others # stubLabel. Is there any reason why all or none have to be labeled?

Answer 1: As soon as you add a label ANTLR4 will no longer generate a context

What to use in ANTLR4 to resolve ambiguities in more complex cases (instead of syntactic predicates)?

Question: In ANTLR v3, syntactic predicates could be used to resolve ambiguities, i.e., to explicitly tell ANTLR which alternative to choose. ANTLR4 seems to simply accept grammars with similar ambiguities, but reports them during parsing. It still produces a parse tree despite these ambiguities (by choosing the first alternative, according to the documentation). But what can I do if I want it to choose some other alternative? In other words, how can I explicitly resolve ambiguities?