antlr3

ANTLR v3: order token to improve performance of tree walker

梦想的初衷 submitted on 2019-12-25 02:37:12
Question: Is it somehow possible to specify the order of the tokens generated by ANTLR v3? My goal is to order the tokens so that the valid tokens in my rule "expression" follow each other, so that the conditions in the tree walkers (which have about 90 different branches) can be simplified to one branch. Something like if(LA18_0 >= ARRAY_ACCESS && LA18_0 <= VariableId){} ANTLR assigns the values to the tokens in alphabetical order. That means the token beginning with an "a" has the lowest

ANTLR3 Dynamic quotes in lexer

流过昼夜 submitted on 2019-12-24 16:32:57
Question: I need to match something like the Perl regexp matcher m/my regex!*/ where the quotes can be any character from a range. So the above is the same as m%my regex!*% A naive guess at a lexer rule would be REGEX: 'm' quote=. (~(quote))* quote; but that does not work, because the latter quote refers not to the quote= label but to some rule. I can do it with a lot of my own code, like REGEX: 'm' quote=. { ... implement the loop and final match myself ... } ; but somehow I think there should be a
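The "implement the loop and final match myself" route mentioned in the question can be sketched outside ANTLR. The following is a minimal Python illustration, not an ANTLR action: the function name `match_regex_literal` is hypothetical, and it assumes that the closing delimiter is the same character as the opening one, that letters, digits and whitespace are not valid delimiters, and that the body contains no escaped delimiter.

```python
def match_regex_literal(text, pos=0):
    """Match m<delim>body<delim> starting at text[pos].

    Returns (body, end_index) on success, or None if no literal starts here.
    """
    # Need at least 'm' plus one delimiter character.
    if pos + 1 >= len(text) or text[pos] != "m":
        return None
    delim = text[pos + 1]
    # Assumption: letters, digits and whitespace cannot act as delimiters.
    if delim.isalnum() or delim.isspace():
        return None
    # The body runs up to the next occurrence of the same delimiter
    # (escaped delimiters are deliberately not handled in this sketch).
    close = text.find(delim, pos + 2)
    if close == -1:
        return None
    return text[pos + 2:close], close + 1
```

The same logic (remember the character after 'm', then consume input until it reappears) is what a hand-written lexer action would have to do.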

Efficiently replacing a string or character from file-input for the ANTLRInputStream (ANTLRStringStream)

喜你入骨 submitted on 2019-12-24 10:47:54
Question: As I described in Antlr greedy-option, I have some problems with a language that can include string literals inside a string literal, such as: START: "img src="test.jpg"" Mr. Bart Kiers mentioned in my thread that it is not possible to create a grammar which could solve my problem. Therefore I decided to change the language to: START: "img src='test.jpg'" before starting the lexer (and parser). File-input could be: START: "aaa"aaa" "aaa"aaaaa" :END_START START: "aaa"aaa" "aaa"aa a aa" :END

ANTLR parse assignments

北城余情 submitted on 2019-12-24 09:22:38
Question: I want to parse some assignments, where I only care about the assignment as a whole, not about what's inside the assignment. An assignment is indicated by ':=' . (EDIT: Before and after the assignments other things may come) Some examples: a := TRUE & FALSE; c := a ? 3 : 5; b := case a : 1; !a : 0; esac; Currently I distinguish between assignments containing a 'case' and other assignments. For simple assignments I tried something like ~('case' | 'esac' | ';') but then ANTLR complained

ANTLR lexer rule consumes too much

喜欢而已 submitted on 2019-12-24 08:27:46
Question: ANTLR Lexer Rule Design. I have a requirement for the following token:
- Allowable characters include uppercase, lowercase, numeric, space, and hyphen characters
- Unfixed length (must be at least two characters in length)
- Token must contain at least one space or hyphen
- Token must start and end in an uppercase, lowercase, numeric, space, or hyphen character (cannot begin or end with a space)
The ANTLR lexer rule "AlphaNumericSpaceHyphen" in the grammar below almost works except for one case. Using
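The token requirements listed in the question can be written down as a single regular expression, which is handy as a reference when checking what a lexer rule should accept. This is a sketch in Python, not the asker's ANTLR rule; the name `is_valid_token` is hypothetical, and it reads "cannot begin or end with a space" as meaning that the first and last characters must be a letter, digit, or hyphen.

```python
import re

# (?=.*[ -])        at least one space or hyphen somewhere in the token
# [A-Za-z0-9-]      first character: letter, digit, or hyphen (no space)
# [A-Za-z0-9 -]*    middle characters: letter, digit, space, or hyphen
# [A-Za-z0-9-]      last character: letter, digit, or hyphen (no space)
_TOKEN_RE = re.compile(r"(?=.*[ -])[A-Za-z0-9-][A-Za-z0-9 -]*[A-Za-z0-9-]")


def is_valid_token(text):
    """Return True if text satisfies all four requirements from the question."""
    return _TOKEN_RE.fullmatch(text) is not None
```

Because the pattern needs a distinct first and last character, the minimum length of two falls out without a separate check.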

antlr3 remove treenode with subtree

怎甘沉沦 submitted on 2019-12-24 07:47:28
Question: I am trying to do some tree-to-tree transformation with ANTLR 3.4. It's (for this question) about boolean expressions where "AND" and "OR" are allowed to bind to n expressions. The parser stage creates something like this (OR (AND (expr1) (expr2) (expr3) (OR (AND (expr4)) (AND (expr5)) (AND (expr6)) ) ) ) Unfortunately there are AST nodes for "AND" and "OR" that bind to just one expression. (Which is useless, but hey - rules andExpr and orExpr are invoked) I tried to kick them out (I mean, replace them by

ANTLR Lua long string grammar rules

旧巷老猫 submitted on 2019-12-24 07:39:20
Question: I'm trying to create an ANTLR parser for Lua. So I took the grammar produced by Nicolai Mainero (available at ANTLR's site, Lua 5.1 grammar) and began to work. The grammar is good. One thing is not working: long strings. The Lua specification rule: Literal strings can also be defined using a long format enclosed by long brackets. We define an opening long bracket of level n as an opening square bracket followed by n equal signs followed by another opening square bracket. So, an opening long bracket of level 0
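The long-bracket rule quoted above is hard to express in a context-free lexer rule because the closing bracket must carry the same number of equal signs as the opening one. A hand-written matcher makes the counting explicit. The Python sketch below is an illustration only: `read_long_string` is a hypothetical name, the closing-bracket form (`]`, n equal signs, `]`) comes from the Lua 5.1 reference manual rather than from the excerpt, and the manual's rule about skipping a newline directly after the opening bracket is left out.

```python
def read_long_string(text, pos=0):
    """Read a Lua long string starting at text[pos].

    Returns (level, body, end_index), or None if no opening long bracket
    starts at pos or the matching closing long bracket is missing.
    """
    if pos >= len(text) or text[pos] != "[":
        return None
    # Count the equal signs between the two opening square brackets.
    i = pos + 1
    while i < len(text) and text[i] == "=":
        i += 1
    level = i - (pos + 1)
    if i >= len(text) or text[i] != "[":
        return None
    body_start = i + 1
    # The closing long bracket must have exactly the same level.
    closer = "]" + "=" * level + "]"
    close = text.find(closer, body_start)
    if close == -1:
        return None
    return level, text[body_start:close], close + len(closer)
```

In an ANTLR 3 lexer the same counting would have to live in an embedded action, since the level is not known until the opening bracket has been read.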

Semantic Predicates antlr don't recognize chain of integers of width 4

我的未来我决定 submitted on 2019-12-24 02:30:29
Question: I need to recognize arrays of integers in Fortran's I4 format (which stands for an integer of width four), as in the following example: Using a pure context-free grammar: WS : ' ' ; MINUS : '-' ; DIGIT : '0'..'9' ; int4: WS WS (WS| MINUS ) DIGIT | WS (WS| MINUS ) DIGIT DIGIT | (WS| MINUS | DIGIT ) DIGIT DIGIT DIGIT ; numbers : int4*; The above example is correctly matched: However, if I use semantic predicates to encode the semantic constraints of rule int4 : int4 scope { int n; } @init { $int4::n = 0; } :
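For comparison, the set of fields accepted by the context-free `int4` rule above (leading spaces, an optional minus sign, then digits, exactly four characters in total) can be checked with fixed-width slicing. This Python sketch is only a reference for what the grammar accepts, not a replacement for the semantic-predicate version; `parse_i4_fields` is a hypothetical name, and the input line is assumed to contain nothing but I4 fields.

```python
import re

# Within a 4-character field: optional leading spaces, optional minus, digits.
_I4_FIELD = re.compile(r" *-?[0-9]+")


def parse_i4_fields(line):
    """Split a line into width-4 fields and return them as a list of ints."""
    if len(line) % 4 != 0:
        raise ValueError("line length must be a multiple of 4")
    values = []
    for start in range(0, len(line), 4):
        field = line[start:start + 4]
        if _I4_FIELD.fullmatch(field) is None:
            raise ValueError(f"invalid I4 field: {field!r}")
        # int() ignores the leading spaces used for right-justification.
        values.append(int(field))
    return values
```

For example, `parse_i4_fields("   1 -23 4561234")` splits the line into the fields `"   1"`, `" -23"`, `" 456"` and `"1234"`.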

ANTLR generating empty conditions

一笑奈何 submitted on 2019-12-24 02:26:10
Question: I'm trying to learn to use ANTLR, but I cannot figure out what's wrong with my code in this case. I hope this will be really easy for anyone with some experience with it. This is the grammar (really short). grammar SmallTest; @header { package parseTest; import java.util.ArrayList; } prog returns [ArrayList<ArrayList<String>> all] :(stat { if ($all == null) $all = new ArrayList<ArrayList<String>>(); $all.add($stat.res); } )+ ; stat returns [ArrayList<String> res] :(element { if ($res == null)

ANTLR: Lexer does not recognize token

半腔热情 submitted on 2019-12-24 01:24:30
Question: Given the following lexer grammar: lexer grammar CodeTableLexer; CodeTabHeader : '[code table 1.0]'; Code : 'code'; Table : 'table'; End : 'end'; Row : 'row'; Naming : 'naming'; Dfltlang : 'dfltlang'; Language : 'english' | 'german' | 'french' | 'italian' | 'spanish'; Null : 'null'; Number : Int ('.' Digit*)? ; Identifier : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '$' | '.' | Digit)* ; String @after { setText(getText().substring(1, getText().length() - 1).replaceAll("\\\\(.)",