antlr | 易学教程

What is wrong with this grammar? (ANTLRWorks 1.4)

I have the following code written in ANTLRWorks 1.4: grammar hmm; s : (put_a_in_b)|(put_out_a)|(drop_kick)|(drop_a)|(put_on_a); put_a_in_b : (PUT_SYN)(ID)(IN_SYN)(ID); put_out_a : (PUT2_SYN)(OUT_SYN)(ID) | (E1)(ID); drop_kick : ('drop')('kick')(ID); drop_a : (DROP_SYN)(ID); put_on_a : (E2)(ID); PUT_SYN : 'put' | 'place' | 'drop'; PUT2_SYN : 'put' | 'douse'; IN_SYN : 'in' | 'into' | 'inside' | 'within'; OUT_SYN : 'out'; E1 : 'extinguish'|'douse'; DROP_SYN : 'drop' | 'throw' | 'relinquish'; WS : ( ' ' | '\t' | '\r' | '\n' ) {$channel=HIDDEN;}; ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..

Ignore some part of input when parsing with ANTLR

I'm trying to parse a language with ANTLR (ANTLRWorks-3.5.2). The goal is to feed in the complete input and have ANTLR produce a parse tree for the parts defined in the grammar while ignoring the rest of the input. For example, this is my grammar: grammar asap; project : '/begin PROJECT' name module+ '/end PROJECT'; module : '/begin MODULE'name '/end MODULE'; name : IDENT ; IDENT : ('a'..'z'|'A'..'Z')('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'.'|':'|'-')*; Given input: /begin PROJECT HybridSailboat_2 /begin MODULE engine /begin A2ML /include XCP_common_v1_0.aml "XCP" struct { taggedstruct Common_Parameters ; }; /end A2ML /end MODULE

ANTLR4: Whitespace handling

I have seen many ANTLR grammars that use whitespace handling like this: WS: [ \n\t\r]+ -> skip; // or WS: [ \n\t\r]+ -> channel(HIDDEN); So whitespace is either thrown away or sent to the hidden channel. With a grammar like this: grammar Not; start: expression; expression: NOT expression | (TRUE | FALSE); NOT: 'not'; TRUE: 'true'; FALSE: 'false'; WS: [ \n\t\r]+ -> skip; valid inputs are ' not true ' or ' not false ' but also ' nottrue ', which is not a desired result. Changing the
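To see why ' nottrue ' slips through, it can help to simulate the lexer outside ANTLR. The following is a minimal Python sketch, not ANTLR itself: it assumes a longest-match lexer in which the earlier rule wins on ties, and it adds a hypothetical catch-all ID rule that is not part of the grammar in the question. With only the keyword rules, 'nottrue' splits into NOT and TRUE; with the extra rule, the longer ID match wins and a parser that expects NOT would reject the input.

```python
import re

# Token rules in priority order. On equal match length the earlier rule wins,
# which is the tie-breaking behaviour this sketch assumes.
KEYWORD_RULES = [
    ("NOT", r"not"),
    ("TRUE", r"true"),
    ("FALSE", r"false"),
    ("WS", r"[ \n\t\r]+"),
]

# Hypothetical extra rule (not in the original grammar): a catch-all identifier.
WITH_ID_RULES = KEYWORD_RULES + [("ID", r"[a-z]+")]


def tokenize(text, rules):
    """Longest-match lexer; WS tokens are dropped, similar to `-> skip`."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in compiled:
            match = regex.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if match and match.end() - pos > best_len:
                best_name, best_len = name, match.end() - pos
        if best_name is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if best_name != "WS":
            tokens.append(best_name)
        pos += best_len
    return tokens


print(tokenize("nottrue", KEYWORD_RULES))      # keyword rules only
print(tokenize("nottrue", WITH_ID_RULES))      # with the catch-all ID rule
print(tokenize(" not true ", WITH_ID_RULES))   # keywords still win on ties
```

Placing the ID rule after the keywords matters in this sketch: 'not' and 'true' on their own tie in length with ID, so the earlier keyword rules keep priority.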

Parse and return a list of doubles using ANTLR4

How can I parse a file containing decimal numbers into a List<double> in C# using ANTLR4? A complete, working example would illustrate how all the pieces go together. The input file looks like this: 12.34 45.67 89.10 TomServo This is an updated version of an older answer to a different question, showing one way to do this task using C# and ANTLR4. The Grammar grammar Values; parse : (number ( LINEBREAK | EOF ) )* ; number : NUMBER ; NUMBER : DIGIT '.' DIGIT ; DIGIT : [0-9]+ ; WS : [ \t] -> channel(HIDDEN) ; LINEBREAK : '\r'? '\n' | '\r' ; The Listener Now the class that implements the

Interpreting custom language

I need to develop an application that will read and understand a text file containing a custom language that describes a list of operations (i.e., a cooking recipe). This language has not been defined yet, but it will probably take one of the following shapes: C++ like code (This code is randomly generated, just for example purpose) : begin repeat(10) { bar(toto, 10, 1999, xxx); } result = foo(xxxx, 10); if(foo == ok) { ... } else { ... } end XML code (This code is randomly generated, just for example purpose) : <recipe> <action name="foo" argument="bar, toto, xxx" repeat=10/> <action name="bar

ANTLR - identifier with whitespace

I want identifiers that can contain whitespace. grammar WhitespaceInSymbols; premise : ( options {greedy=false;} : 'IF' ) id=ID{ System.out.println($id.text); }; ID : ('a'..'z'|'A'..'Z')+ (' '('a'..'z'|'A'..'Z')+)* ; WS : ' '+ {skip();} ; When I test this with "IF statement analyzed" I get a MissingTokenException and the output "IF statement analyzed". I thought that by using greedy=false I could tell ANTLR to exit after 'IF' and take it as a token. But instead the IF is part of the ID. Is

terminal/datatype/parser rules in xtext

I'm using Xtext 2.4. What I want to build is a SQL-like syntax. What confuses me is which things should be treated as terminal, datatype, or parser rules. So far my grammar related to MyTerm is: Model: (terms += MyTerm ';')* ; MyTerm: constant=MyConstant | variable?='?'| collection_literal=CollectionLiteral ; MyConstant : string=STRING | number=MyNumber | date=MYDATE | uuid=UUID | boolean=MYBOOLEAN | hex=BLOB ; MyNumber: int=SIGNINT | float=SIGNFLOAT ; SIGNINT returns ecore::EInt: '-'? INT ; SIGNFLOAT returns ecore::EFloat: '-'? INT '.' INT; ; CollectionLiteral: => MapLiteral |

ANTLR: Unicode Character Scanning

Problem: I can't get Unicode characters to print correctly. Here is my grammar: options { k=1; filter=true; // Allow any char but \uFFFF (16 bit -1) charVocabulary='\u0000'..'\uFFFE'; } ANYCHAR :'$' | '_' { System.out.println("Found underscore: "+getText()); } | 'a'..'z' { System.out.println("Found alpha: "+getText()); } | '\u0080'..'\ufffe' { System.out.println("Found unicode: "+getText()); } ; Code snippet of main method invoking the lexer: public static void main(String[] args) { SimpleLexer

How to implement a function call with Antlr so that it can be called even before it is defined?

Once the AST is built, what is the best way to implement the tree walker so that functions can be defined and called in whatever order? For example, this is valid in PHP: <?php f(); // function called before it’s defined function f() { print 3; } ?> I’m guessing that somehow there must be a second pass, or a tree transformation, but I can’t find anything interesting on this subject. The problem is probably not an Antlr-specific one, but if you could point me to an Antlr example of how this is
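The "second pass" idea from the question can be sketched independently of ANTLR. The Python example below is only an illustration under stated assumptions: the AST is a hypothetical list of tuples (`"def"`, `"call"`, `"print"`), not a real ANTLR tree, and the helper names `collect_functions`, `execute`, and `run` are invented for this sketch. The first pass registers every top-level definition in a table; the second pass executes statements and resolves calls through that table, so the order of definition and call no longer matters.

```python
# Hypothetical AST: a program is a list of statement tuples.
#   ("def", name, body)  defines a function whose body is a list of statements
#   ("call", name)       calls a function by name
#   ("print", value)     emits a value


def collect_functions(program):
    """Pass 1: register every top-level definition before running anything."""
    return {stmt[1]: stmt[2] for stmt in program if stmt[0] == "def"}


def execute(statements, functions, output):
    """Pass 2: run statements; calls are resolved through the function table."""
    for stmt in statements:
        kind = stmt[0]
        if kind == "def":
            continue  # already registered in pass 1
        elif kind == "call":
            name = stmt[1]
            if name not in functions:
                raise NameError(f"undefined function: {name}")
            execute(functions[name], functions, output)
        elif kind == "print":
            output.append(stmt[1])
        else:
            raise ValueError(f"unknown node: {kind}")


def run(program):
    functions = collect_functions(program)
    output = []
    execute(program, functions, output)
    return output


# Mirrors the PHP snippet: f() is called before it is defined.
program = [("call", "f"), ("def", "f", [("print", 3)])]
print(run(program))
```

With a real ANTLR tree, the same split could presumably be done with two walks over the tree, one that only visits function definitions and one that evaluates everything else.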

Converting Antlr syntax tree into useful objects

I'm currently pondering how best to take an AST generated using Antlr and convert it into useful objects which I can use in my program. The purpose of my grammar (apart from learning) is to create an executable (runtime interpreted) language. For example, how would I take an attribute sub-tree and have a specific Attribute class instantiated. E.g. The following code in my language: Print(message:"Hello stackoverflow") would produce the following AST: My current line of thinking is that a factory class could read the tree, pull out the name ( message ), and type( STRING ) value(" Hello
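The factory idea described here can be sketched as a table that maps node kinds to builder functions. This Python example is a rough illustration, not the asker's design: the tuple-based tree shape, the node kinds `"CALL"` and `"ATTRIBUTE"`, and the classes `Attribute` and `Call` are all assumptions made for the sketch, since the original AST image is not available.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    name: str
    type: str
    value: object


@dataclass
class Call:
    function: str
    attributes: list


# Assumed tree shape (hypothetical): each node is a tuple whose first item is its kind.
#   ("CALL", function_name, [child nodes])
#   ("ATTRIBUTE", name, type_name, value)


def build_attribute(node):
    _, name, type_name, value = node
    return Attribute(name, type_name, value)


def build_call(node):
    _, function, children = node
    return Call(function, [build(child) for child in children])


# The factory table: one builder per node kind.
BUILDERS = {"ATTRIBUTE": build_attribute, "CALL": build_call}


def build(node):
    """Dispatch on the node kind and return the matching domain object."""
    kind = node[0]
    try:
        builder = BUILDERS[kind]
    except KeyError:
        raise ValueError(f"no builder for node kind {kind!r}") from None
    return builder(node)


# Tree for: Print(message:"Hello stackoverflow")
tree = ("CALL", "Print", [("ATTRIBUTE", "message", "STRING", "Hello stackoverflow")])
print(build(tree))
```

Supporting a new node kind then means adding one builder function and one table entry, which keeps the tree-reading logic out of the domain classes.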