antlr

ANTLR: Get token name?

為{幸葍}努か submitted on 2020-08-24 05:36:08
Question: I have a grammar rule:

    OR : '|';

But when I print the AST using

    public static void Preorder(ITree tree, int depth)
    {
        if (tree == null) { return; }
        for (int i = 0; i < depth; i++) { Console.Write(" "); }
        Console.WriteLine(tree);
        for (int i = 0; i < tree.ChildCount; ++i)
            Preorder(tree.GetChild(i), depth + 1);
    }

(thanks, Bart) it displays the actual | character. Is there a way to make it print "OR" instead?

Answer 1: robert inspired this answer. if (ExpressionParser.tokenNames[tree.Type] == tree.Text
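
The excerpt above indexes a token-name table by the node's type (ExpressionParser.tokenNames[tree.Type]). A minimal Python sketch of that lookup follows. The Node class and the TOKEN_NAMES table are hypothetical stand-ins for the ANTLR tree type and the generated parser's name table, not the ANTLR runtime itself.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical token-name table, indexed by token type. It stands in for
# the tokenNames array that the excerpt reads from the generated parser.
TOKEN_NAMES = ["<invalid>", "OR", "ID"]


@dataclass
class Node:
    """Stand-in for an AST node: token type, token text, and children."""
    type: int
    text: str
    children: List["Node"] = field(default_factory=list)


def preorder(node, depth=0, out=None):
    """Collect one indented line per node, showing the symbolic token name."""
    if out is None:
        out = []
    if node is None:
        return out
    name = TOKEN_NAMES[node.type]
    # Show the rule name first and keep the matched text for reference.
    out.append(" " * depth + f"{name} ({node.text})")
    for child in node.children:
        preorder(child, depth + 1, out)
    return out


tree = Node(1, "|", [Node(2, "a"), Node(2, "b")])
lines = preorder(tree)
print("\n".join(lines))
```

Keeping the matched text next to the name means identifier nodes still show which identifier they hold, while operator nodes read as OR rather than |.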

What are the ways to speed up parsing in Antlr4?

安稳与你 submitted on 2020-08-19 05:34:08
Question: I have some doubts about the performance of ANTLR4. I am currently using Python with ANTLR4, and it is terribly slow compared to Java (verified using the ANTLR4 IntelliJ plugin). Since I need to parse larger code bases, I am planning to switch to whichever language is fastest with ANTLR (e.g. Java, C, or Python). Any suggestions? Any tips on optimizing the ANTLR grammar for faster parsing? (I am trying some online resources.) If I continue with Python itself, what are the best ways

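Breaking the Foreign Monopoly: Developing China's Own Programming Language (1): Writing a Calculator That Parses Expressions

北慕城南 submitted on 2020-08-06 21:08:52
Reading this series will be "the most brutal brainstorm; is everyone ready?" This article is the first in the series "Breaking the Foreign Monopoly: Developing China's Own Programming Language". The main goal of the series is to teach readers how to design a programming language (the marvel language) from scratch and to use it to develop real projects such as mobile apps and web applications. The marvel language can run in three ways:
1. Interpreted execution
2. Compiled to Java bytecode and executed on the JVM
3. Compiled to a native binary and executed locally (based on LLVM)
Unlike many "do it yourself" series, the marvel language implemented here is not a toy. It is an industrial-grade programming language, on the same level as Kotlin and Java, and was designed from the start to experiment with new programming-language features. Some features of the Ori language embedded in UnityMarvel, the super-platform development system built by our team, also come from Marvel. The details of UnityMarvel will be covered in a separate article; here we first discuss the compiler.
1. If system software comes under restriction, is it possible to break out?
As we know, the China-US trade war is now in full swing, and using foreign software, especially system software, may become problematic in the future. This requires us to have technology and software under our own control in key areas such as operating systems, programming languages, databases, and scientific computing software. Most of these kinds of software are in fact foundational software; only operating systems and programming languages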

Using Visitors in AntLR4 in a Simple Integer List Grammar

左心房为你撑大大i submitted on 2020-08-05 05:50:10
Question: I'm a newbie in ANTLR, and I'm using ANTLR 4. I wrote the following attribute grammar, which recognizes a list of integers and prints the sum of the list at the end.

list.g4

    grammar list;

    @header {
        import java.util.List;
        import java.util.ArrayList;
    }

    list
        : BEGL (elems[new ArrayList<Integer>()])? ENDL
          {
              int sum = 0;
              if ($elems.text != null)
                  for (Integer i : $elems.listOut)
                      sum += i;
              System.out.println("List Sum: " + sum);
          }
        ;

    elems [List<Integer> listIn] returns [List<Integer> listOut]
        : a=elem
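
The title asks how the same sum could be computed with a visitor instead of embedded actions. As a rough sketch of the visitor idea only, the Python below walks a hand-built tree. ListNode, ElemNode, and SumVisitor are hypothetical stand-ins for the context and visitor classes ANTLR would generate from list.g4; nothing here uses the ANTLR runtime.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ElemNode:
    """Stand-in for a parse-tree node that holds one integer token."""
    text: str


@dataclass
class ListNode:
    """Stand-in for the parse-tree node of the whole list."""
    elems: List[ElemNode]


class SumVisitor:
    """Computes the list sum by visiting nodes, with no grammar actions."""

    def visit(self, node):
        # Dispatch on the node's class name: ListNode -> visit_ListNode.
        method = getattr(self, "visit_" + type(node).__name__)
        return method(node)

    def visit_ListNode(self, node):
        # The result of a list is the sum of its children's results.
        return sum(self.visit(elem) for elem in node.elems)

    def visit_ElemNode(self, node):
        # A leaf returns the integer value of its token text.
        return int(node.text)


tree = ListNode([ElemNode("1"), ElemNode("2"), ElemNode("3")])
total = SumVisitor().visit(tree)
print("List Sum:", total)
```

Because each visit method returns a value, the grammar needs no listIn/listOut parameters; the accumulation happens in the visitor's return values.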

Call the correct ANTLR visitor without calling all preceding visitors until a match

痞子三分冷 submitted on 2020-06-17 13:11:06
Question: I'm a little confused about how to do this. For example, I have these rules:

    stat        : '(' expression ')'               #ExpressionStatement ;
    expression1 : expression2 ('==' expression2)*  #ValidateExpression1 ;
    expression2 : literal ('!=' literal)*          #ValidateExpression2 ;
    literal     : ( 'true' | 'false')              #literal ;

In this case, if I pass "(true)" to the parser, how can I visit just VisitLiteral() and return true, instead of visiting the entire tree until the match? In the visitor: VisitExpressionStatement() if I do

Call the correct ANTLR visitor without calling all preceding visitors until a match

可紊 submitted on 2020-06-17 13:10:05
Question: I'm a little confused about how to do this. For example, I have these rules:

    stat        : '(' expression ')'               #ExpressionStatement ;
    expression1 : expression2 ('==' expression2)*  #ValidateExpression1 ;
    expression2 : literal ('!=' literal)*          #ValidateExpression2 ;
    literal     : ( 'true' | 'false')              #literal ;

In this case, if I pass "(true)" to the parser, how can I visit just VisitLiteral() and return true, instead of visiting the entire tree until the match? In the visitor: VisitExpressionStatement() if I do

Is there a way to easily adapt the error messages of ANTLR4?

断了今生、忘了曾经 submitted on 2020-06-08 12:01:31
Question: Currently I'm working on my own grammar, and I would like to have specific error messages for NoViableAlternative, InputMismatch, UnwantedToken, MissingToken, and LexerNoViableAltException. I have already extended the Lexer class and overridden notifyListeners to change the default error message "token recognition error at:" to my own. I have also extended DefaultErrorStrategy and overridden all the report methods, like reportNoViableAlternative, reportInputMismatch,

Negated lexer rules/tokens

旧城冷巷雨未停 submitted on 2020-05-17 08:49:28
Question: I am trying to match (and ignore) C-style block comments. To me the sequence is (1) /* followed by (2) anything other than /* or */ until (3) */.

    BLOCK_COMMENT_START : "/*" ;
    BLOCK_COMMENT_END   : "*/" ;
    BLOCK_COMMENT
        : BLOCK_COMMENT_START
          ( ~( BLOCK_COMMENT_START | BLOCK_COMMENT_END ) )*
          BLOCK_COMMENT_END
          {
              // again, we want to skip the entire match from the lexer stream
              $setType( Token.SKIP );
          }
        ;

But ANTLR does not think like I do ;)

    sql-stmt.g:121:34: This subrule cannot be inverted. Only
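
The grammar above tries to negate multi-character tokens. A common alternative is a non-greedy match that stops at the first */; in ANTLR 4 syntax this is usually written as BLOCK_COMMENT : '/*' .*? '*/' -> skip ; (the question appears to target an older ANTLR version, so that form may not apply directly). The Python sketch below illustrates the same non-greedy idea with a regular expression. It is an analogue, not ANTLR code, and the function name is made up for illustration. Unlike condition (2) in the question, it does not reject a /* inside a comment.

```python
import re

# Non-greedy match: "/*", then as little as possible, then the first "*/".
# re.DOTALL lets "." cross line breaks so multi-line comments match too.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def strip_block_comments(source: str) -> str:
    """Return source with every C-style block comment removed."""
    return BLOCK_COMMENT.sub("", source)


sql = "SELECT a /* first\ncomment */ FROM t /* second */ WHERE a > 1"
cleaned = strip_block_comments(sql)
print(cleaned)
```

A greedy .* would instead run from the first /* to the last */ and swallow the code between two comments, which is why the non-greedy quantifier matters here.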

ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7.1

僤鯓⒐⒋嵵緔 submitted on 2020-05-13 06:12:50
Question: I'm getting an error in DataBindingMapperImpl.java for one specific data binding, which results in the following errors when building the project.

    ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7.1
    ANTLR Runtime version 4.5.3 used for parser compilation does not match the current runtime version 4.7.1
    ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.7.1
    ANTLR Runtime version 4.5.3 used for parser

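Spark SQL Source Code Analysis (3): The Analysis Phase

喜欢而已 submitted on 2020-04-29 12:40:17
Earlier articles in this Spark SQL internals series: Spark SQL Source Code Analysis (1): Overview of the Catalyst SQL Parsing Framework; Spark SQL Source Code Analysis (2): Parsing SQL with Antlr4 and Generating the Tree.

Overview of the Analysis phase

First, a new concept needs to be introduced. As described earlier, the SQL parse phase uses antlr4 to parse a SQL statement into a syntax tree, then uses antlr4's visitor pattern to traverse that tree and produce the Logical Plan. The Logical Plan produced by the SQL parse phase is, however, called the Unresolved Logical Plan. "Unresolved" means that the objects referenced in the SQL statement have not yet been resolved. Take the statement SELECT col FROM sales: as long as we do not know the concrete type of col (Int, String, or something else), or even whether the sales table has a column named col at all, the plan is unresolved.

The analysis phase addresses exactly this problem: it turns the unresolved plan into a resolved one. Spark SQL uses Catalyst rules and the Catalog to track the table information of the data sources, and applies the following rules to the unresolved plan (a rule is applied whenever it matches certain nodes of the tree). From the Catalog, look up the relations that correspond to the Unresolved Logical Plan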
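
To make the unresolved-to-resolved step concrete, here is a toy Python sketch for the SELECT col FROM sales example. It is not Spark code: CATALOG, UnresolvedColumn, ResolvedColumn, AnalysisError, and resolve are all hypothetical names chosen for illustration, and a plain dictionary plays the role of the Catalog.

```python
from dataclasses import dataclass

# Hypothetical catalog: table name -> {column name -> type name}.
CATALOG = {"sales": {"col": "Int", "region": "String"}}


@dataclass(frozen=True)
class UnresolvedColumn:
    """A column reference as produced by parsing: only a name is known."""
    name: str


@dataclass(frozen=True)
class ResolvedColumn:
    """A column reference after analysis: table and type are known."""
    name: str
    table: str
    dtype: str


class AnalysisError(Exception):
    """Raised when a table or column cannot be found in the catalog."""


def resolve(column, table, catalog=CATALOG):
    """Look the column up in the catalog and attach its table and type."""
    if table not in catalog:
        raise AnalysisError(f"table not found: {table}")
    columns = catalog[table]
    if column.name not in columns:
        raise AnalysisError(f"column {column.name} not found in {table}")
    return ResolvedColumn(column.name, table, columns[column.name])


resolved = resolve(UnresolvedColumn("col"), "sales")
print(resolved)
```

The two failure branches correspond to the two unknowns named in the text: whether the table exists, and whether it has the requested column.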