antlr

ANTLR lexer rule consumes too much

喜欢而已 submitted on 2019-12-24 08:27:46
Question: ANTLR Lexer Rule Design. I have a requirement for the following token: allowable characters include uppercase, lowercase, numeric, space, and hyphen characters; the length is not fixed, but must be at least two characters; the token must contain at least one space or hyphen; the token must start and end in an uppercase, lowercase, numeric, space, or hyphen character (it cannot begin or end with a space). The ANTLR lexer rule "AlphaNumericSpaceHyphen" in the grammar below almost works except for one case. Using
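The four requirements above can be checked outside ANTLR, which is handy for building test inputs for the lexer rule. The following is a minimal Python sketch, not the asker's grammar: the helper name `is_alnum_space_hyphen` and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical helper, not part of the question's grammar.
# First and last character: letter, digit, or hyphen (never a space).
# Characters in between: letter, digit, space, or hyphen.
# Two mandatory positions also enforce the minimum length of two.
_SHAPE = re.compile(r"[A-Za-z0-9-][A-Za-z0-9 -]*[A-Za-z0-9-]")


def is_alnum_space_hyphen(text: str) -> bool:
    """Return True if text satisfies the token requirements listed above."""
    if _SHAPE.fullmatch(text) is None:
        return False
    # At least one space or hyphen must appear somewhere in the token.
    return " " in text or "-" in text
```

Splitting the check into a shape test and a separate "contains a space or hyphen" test keeps each condition readable; a single lexer rule has to encode both at once, which is where such rules tend to go wrong.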

antlr3 remove treenode with subtree

怎甘沉沦 submitted on 2019-12-24 07:47:28
Question: I am trying to do a tree-to-tree transform with ANTLR 3.4. It is (for this question) about boolean expressions where "AND" and "OR" are allowed to bind to n expressions. The parser stage creates something like this: (OR (AND (expr1) (expr2) (expr3) (OR (AND (expr4)) (AND (expr5)) (AND (expr6)) ) ) ) Unfortunately there are AST nodes for "AND" and "OR" that bind to just one expression. (Which is useless, but hey, the rules andExpr and orExpr are invoked.) I tried to kick them out (that is, replace them by
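The intended transform, dropping every "AND" or "OR" node that has exactly one child, can be sketched independently of ANTLR. The Python below is only an illustration of the idea, not ANTLR 3 tree-rewrite syntax; the `Node` class and the `collapse` function are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node: a label plus an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def collapse(node: Node) -> Node:
    """Replace every AND/OR node that has exactly one child by that child."""
    # Rewrite bottom-up so that nested single-child chains also disappear.
    kids = [collapse(child) for child in node.children]
    if node.label in ("AND", "OR") and len(kids) == 1:
        return kids[0]
    return Node(node.label, kids)
```

Applied to the tree from the question, the outer OR (one child) and the three inner single-child AND nodes are removed, leaving an AND over expr1, expr2, expr3 and an OR over expr4, expr5, expr6.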

ANTLR Lua long string grammar rules

旧巷老猫 submitted on 2019-12-24 07:39:20
Question: I'm trying to create an ANTLR parser for Lua. So I took the grammar produced by Nicolai Mainero (available at ANTLR's site, Lua 5.1 grammar) and began to work. The grammar is good. One thing is not working: LONG STRINGS. The Lua specification rule: Literal strings can also be defined using a long format enclosed by long brackets. We define an opening long bracket of level n as an opening square bracket followed by n equal signs followed by another opening square bracket. So, an opening long bracket of level 0
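What makes long strings awkward for a grammar is that the closing bracket must carry the same number of equal signs as the opening one. A small Python sketch shows that dependency with a regex backreference. The function name `match_long_string` is hypothetical, and the closing-bracket form (a closing square bracket, n equal signs, another closing square bracket) comes from the Lua 5.1 manual rather than from the excerpt above.

```python
import re

# '[' + n '=' + '[' opens a long string of level n; the backreference \1
# forces the closing bracket to repeat exactly the same run of '=' signs.
_LONG_STRING = re.compile(r"\[(=*)\[(.*?)\]\1\]", re.DOTALL)


def match_long_string(source: str, pos: int = 0):
    """Return (level, body, end_offset) for a long string at pos, or None."""
    m = _LONG_STRING.match(source, pos)
    if m is None:
        return None
    return len(m.group(1)), m.group(2), m.end()
```

A context-free lexer rule cannot count the equal signs on its own, so grammar-based solutions usually need a predicate or hand-written action for this token. The sketch also keeps a newline that directly follows the opening bracket, which real Lua drops.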

Antlr - Parsing Multiline #define for C.g4

我是研究僧i submitted on 2019-12-24 07:37:40
Question: I am using ANTLR 4 to parse C code. I want to parse multiline #defines along with C.g4 provided in C.g4. But the grammar mentioned in the link above does not support preprocessor directives, so I have added the following new rules to support preprocessing. Link to my previous question
    Whitespace : [ \t]+ -> channel(HIDDEN) ;
    Newline : ( '\r' '\n'? | '\n' ) -> channel(HIDDEN) ;
    BlockComment : '/*' .*? '*/' ;
    LineComment : '//' ~[\r\n]* ;
    IncludeBlock : '#' Whitespace? 'include' ~[\r\n]* ;
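In C, a #define spans several physical lines when each line but the last ends in a backslash. One way to see what a multiline rule has to accept is to join those lines before tokenizing. This Python sketch assumes that pre-pass approach; the function name `join_continuations` is hypothetical and the approach is not taken from the question.

```python
from typing import List


def join_continuations(source: str) -> List[str]:
    """Merge physical lines that end in a backslash into one logical line."""
    logical: List[str] = []
    pending = ""
    for line in source.splitlines():
        if line.endswith("\\"):
            # Drop the backslash and keep collecting the same logical line.
            pending += line[:-1]
        else:
            logical.append(pending + line)
            pending = ""
    if pending:
        # The input ended right after a continuation; keep what was collected.
        logical.append(pending)
    return logical
```

After such a pass, a rule of the same shape as IncludeBlock (a '#', the directive name, then everything up to the end of the line) would be enough for a define. The alternative is to allow a backslash followed by a newline inside the lexer rule itself.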

my lexer token action is not invoked

给你一囗甜甜゛ submitted on 2019-12-24 05:36:10
Question: I use ANTLR 4 with the JavaScript target. Here is a sample grammar:
    P : T ;
    T : [a-z]+ {console.log(this.text);} ;
    start: P ;
When I run the generated parser, nothing is printed, although the input is matched. If I move the action to the token P, then it gets invoked. Why is that?
Answer 1: Actions are ignored in referenced rules. This was the original behavior of ANTLR 4, back when the lexer only supported a single action per token (and that action had to appear at the end of the token). Several

How can I parse a special character differently in two terminal rules using antlr?

懵懂的女人 submitted on 2019-12-24 03:23:08
Question: I have a grammar that uses the $ character at the start of many terminal rules, such as $video{ , $audio{ , $image{ , $link{ and others like these. However, I'd also like to match all the $ and { and } characters that don't match these rules. For example, my grammar does not properly match $100 in the CHUNK rule, but adding the $ to the long list of acceptable characters in CHUNK causes the other production rules to break. How can I change my grammar so that it's smart enough to
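The behavior being asked for is: try the specific `$name{` openers first, and fall back to a single-character token for a stray `$` or `{`. The Python sketch below shows that ordering with a regex tokenizer. It is not an ANTLR grammar, and the token names OPEN and CLOSE and the function `tokenize` are assumptions (only CHUNK appears in the question).

```python
import re

# Alternatives are tried left to right, so the specific '$name{' openers
# win over the fallback that accepts a lone '$' or '{'.
_TOKEN = re.compile(
    r"(?P<OPEN>\$(?:video|audio|image|link)\{)"
    r"|(?P<CLOSE>\})"
    r"|(?P<CHUNK>[^${}]+|[${])"
)


def tokenize(text: str):
    """Return a list of (token_name, lexeme) pairs covering the whole input."""
    return [(m.lastgroup, m.group()) for m in _TOKEN.finditer(text)]
```

The key design choice is that the plain-text alternative never contains `$`, so it cannot swallow the start of an opener; the stray `$` gets its own one-character alternative instead of being added to the long character list.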

ANTLR generating empty conditions

一笑奈何 submitted on 2019-12-24 02:26:10
Question: I'm trying to learn to use ANTLR, but I cannot figure out what's wrong with my code in this case. I hope this will be really easy for anyone with some experience with it. This is the grammar (really short).
    grammar SmallTest;
    @header { package parseTest; import java.util.ArrayList; }
    prog returns [ArrayList<ArrayList<String>> all] : (stat { if ($all == null) $all = new ArrayList<ArrayList<String>>(); $all.add($stat.res); } )+ ;
    stat returns [ArrayList<String> res] : (element { if ($res == null)

Migration tool for ANTLR grammar

蹲街弑〆低调 submitted on 2019-12-24 01:54:44
Question: Suppose I have the following simple grammar (a query DSL):
    grammar TestGrammar;
    term : textTerm ;
    textTerm : 'Text' '(' T_VALUE '=' STRING+ ')' ;
    T_VALUE : 'value' ;
    STRING : '"' .+? '"' ;
    WS : [ \t\r\n]+ -> skip ;
Then at some point I decide that the text term format needs to be changed, for example: Text(value = "123") -> MyText(val = "123"). How should I approach migrating existing data that users have generated with the previous version of the grammar?
Answer 1: Assumption. Let's make one simplification of your
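For a change as small as the example, one simple option (not necessarily the approach the answer goes on to describe) is a one-off rewrite of stored queries. This Python sketch assumes the old queries are plain strings and that string literals contain no embedded quotes; `migrate` is a hypothetical helper name.

```python
import re

# Matches the old form: Text ( value = "..." "..." ... )
# \b keeps the pattern from matching inside an already migrated 'MyText'.
_OLD_TERM = re.compile(r'\bText\s*\(\s*value\s*=\s*((?:"[^"]+"\s*)+)\)')


def migrate(query: str) -> str:
    """Rewrite Text(value = ...) terms into the new MyText(val = ...) form."""
    return _OLD_TERM.sub(
        lambda m: f"MyText(val = {m.group(1).strip()})", query
    )
```

A regex rewrite only stays safe while the grammar is this small. For anything nested, parsing with the old grammar and re-emitting text from the parse tree is the more robust route.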

Antlrworks - extraneous input

送分小仙女□ submitted on 2019-12-24 01:37:21
Question: I am new to this, and for that reason I will need your help. I am trying to parse the Wikipedia dump, and my first step is to map each rule defined by them into ANTLR. Unfortunately I hit my first barrier: line 1:8 extraneous input ''''' expecting '\'\'' I do not understand what is going on; please lend me your help. My code:
    grammar Test;
    options { language = Java; }
    parse : term+ EOF ;
    term : IDENT | '[[' term ']]' | '\'\'' term '\'\'' | '\'\'\'' term '\'\'\'' ;
    IDENT : ('a'..'z' |

ANTLR: Lexer does not recognize token

半腔热情 submitted on 2019-12-24 01:24:30
Question: Given the following lexer grammar:
    lexer grammar CodeTableLexer;
    CodeTabHeader : '[code table 1.0]';
    Code : 'code';
    Table : 'table';
    End : 'end';
    Row : 'row';
    Naming : 'naming';
    Dfltlang : 'dfltlang';
    Language : 'english' | 'german' | 'french' | 'italian' | 'spanish';
    Null : 'null';
    Number : Int ('.' Digit*)? ;
    Identifier : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '$' | '.' | Digit)* ;
    String @after { setText(getText().substring(1, getText().length() - 1).replaceAll("\\\\(.)",