antlr4 | 易学教程

Extracting specific tags from arbitrary plain text

Question: I want to parse plain-text comments and look for certain tags within them. The tags I'm looking for look like <name#1234>, where "name" is a [a-z] string (from a fixed list) and "1234" represents a [0-9]+ number. These tags can occur within a string zero or more times and be surrounded by arbitrary other text. For example, the following strings are all valid: "Hello <foo#56> world!" "<bar#1>!" "1 < 2" "+<baz#99>+<squid#0> and also<baz#99>.\n\nBy the way, maybe <foo#9876>" The
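The tag format described in this question can be illustrated without ANTLR at all. The following is a minimal sketch that swaps in a plain regular expression rather than a grammar; the list of names (foo, bar, baz, squid) is an assumption taken from the example strings, and `extract_tags` is a hypothetical helper name.

```python
import re

# Assumption: the "fixed list" of tag names, inferred from the question's examples.
KNOWN_NAMES = ("foo", "bar", "baz", "squid")

# Matches <name#digits> where name is one of the known names.
TAG_RE = re.compile(r"<(" + "|".join(KNOWN_NAMES) + r")#([0-9]+)>")


def extract_tags(text):
    """Return (name, number) pairs for every tag in text, in order of appearance."""
    return [(m.group(1), int(m.group(2))) for m in TAG_RE.finditer(text)]
```

Text that merely contains a "<", such as "1 < 2", yields no matches, so the surrounding prose is ignored rather than rejected.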

Antlr4 C++ target

Question: We're starting a project where we will need to parse Python source files in a C++ application. I used Antlr2 a while back to generate a few compilers, but this is the first time I'm using Antlr4. It looks like the C++ Antlr4 target is fairly active at https://github.com/antlr/antlr4-cpp So my question is basically: what is the status of the Antlr4 C++ target, and is it ready to be used? To use the C++ target, do I just grab the Antlr4 source, and copy the Antlr4-cpp into this tree and

How can I import an ANTLR lexer grammar into another grammar using Gradle 2.10?

Question: I've been learning about ANTLR 4 with Terence Parr's The Definitive ANTLR 4 Reference, which I've been following so far using Gradle 2.10 and its built-in ANTLR plugin. However, I'm having trouble getting some code that I adapted from Chapter 4, pp. 38-41, to work properly with my Gradle build script. (The reason I'm using Gradle, rather than ANTLR directly, is that I want to eventually integrate ANTLR into a Java web application which I'm making for my dissertation, and I'd strongly

ANTLR4 Flattening a ParserRuleContext Tree into an Array

Question: How do I flatten a ParserRuleContext with subtrees into an array of tokens? ParserRuleContext.getTokens(int ttype) looks good, but what is ttype? Is it the token type? What value should I use if I want to include all token types? Answer 1: ParserRuleContext.getTokens(int ttype) only retrieves certain child nodes of a parent: it does not recursively descend into the subtrees. However, it is easy enough to write yourself: /** * Retrieves all Tokens from the {@code tree} in an in-order sequence. * * @param
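The recursive collection the answer describes can be sketched as follows. This is not the answer's own code (which is cut off above) but a minimal illustration of the idea; `Leaf` and `Rule` are hypothetical stand-ins for ANTLR's terminal nodes and rule contexts so that the example runs without the ANTLR runtime. With the real runtime, the leaf test would check for a TerminalNode and read its token via getSymbol().

```python
class Leaf:
    """Hypothetical stand-in for a terminal node wrapping a single token."""

    def __init__(self, token):
        self.token = token


class Rule:
    """Hypothetical stand-in for a rule context with ordered children."""

    def __init__(self, *children):
        self.children = list(children)


def flatten(tree):
    """Collect the tokens of all leaves, left to right, by depth-first traversal."""
    tokens = []
    stack = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, Leaf):
            tokens.append(node.token)
        else:
            # Push children in reverse so the leftmost child is popped first.
            stack.extend(reversed(node.children))
    return tokens
```

An explicit stack is used instead of recursion so that very deep trees do not hit Python's recursion limit.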

Antlr4: How can I both hide and use Tokens in a grammar

Question: I'm parsing a script language that defines two types of statements: control statements and non-control statements. Non-control statements always end with ';', while control statements may end with ';' or EOL ('\n'). A part of the grammar looks like this: script : statement* EOF ; statement : control_statement | no_control_statement ; control_statement : if_then_control_statement ; if_then_control_statement : IF expression THEN end_control_statment ( statement ) * ( ELSEIF expression

ANTLR chaining 1 to 1 grammar rules together to solve conditionals

Question: If you look at the ObjectiveC ANTLR v3 grammar (http://www.antlr3.org/grammar/1212699960054/ObjectiveC2ansi.g) and many of the other popular grammars out there, they use a structure similar to this for handling conditionals: conditional_expression : logical_or_expression ('?' logical_or_expression ':' logical_or_expression)? ; constant_expression : conditional_expression ; logical_or_expression : logical_and_expression ('||' logical_and_expression)* ; logical_and_expression : inclusive_or

Slow ANTLR4 generated Parser in Python, but fast in Java

Question: I am trying to convert an ANTLR3 grammar to an ANTLR4 grammar in order to use it with the antlr4-python2-runtime. This grammar is a C/C++ fuzzy parser. After converting it (basically removing tree operators and semantic/syntactic predicates), I generated the Python2 files using: java -jar antlr4.5-complete.jar -Dlanguage=Python2 CPPGrammar.g4 The code is generated without any error, so I import it into my Python project (I'm using PyCharm) to run some tests: import sys, time from antlr4

What is a minimal sample Gradle project for ANTLR4 (with the antlr plugin)?

Question: I have created a new Gradle project and added apply plugin: 'antlr' and dependencies { antlr "org.antlr:antlr4:4.5.3" to build.gradle. I created a src/main/antlr/test.g4 file with the following content: grammar test; r : 'hello' ID; ID : [a-z]+ ; WS : [ \t\r\n]+ -> skip ; But it doesn't work. No Java source files are generated (and no error occurs). What did I miss? The project is here: https://github.com/dims12/AntlrGradlePluginTest2 UPDATE: I found that my sample actually works, but it puts the code into \build

Parse a formula using ANTLR4

Question: I am trying to parse a mathematical formula into a subset of LaTeX using ANTLR4. For example, it should parse (a+4)/(b*10) to \frac{a+4}{b\cdot 10}. My simple grammar creates a tree like this: Now I am trying to implement parse tree listeners to somehow construct the LaTeX string while the tree is traversed. Here I am failing, because a string like \frac{}{} has to be built recursively. The parse tree walker, however, visits one tree node after the other (in a breadth-first way
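The recursive construction the question asks about can be sketched independently of ANTLR. In the sketch below, nested tuples of the form (operator, left, right) are a hypothetical stand-in for the parse tree, and `to_latex` and `wrap_sum` are illustrative names; with ANTLR, a visitor whose visit methods return strings would typically play the same role. Each node builds its LaTeX from the strings returned for its children, which is what makes \frac{}{} straightforward to assemble.

```python
def wrap_sum(node):
    """Render node, adding parentheses if it is a sum or difference."""
    text = to_latex(node)
    if isinstance(node, tuple) and node[0] in ("+", "-"):
        return "(" + text + ")"
    return text


def to_latex(node):
    """Recursively convert an (op, left, right) tuple tree into a LaTeX string."""
    if isinstance(node, str):
        return node  # variable name or number literal
    op, left, right = node
    if op == "/":
        # The braces of \frac already group each operand, so no parentheses are needed.
        return "\\frac{%s}{%s}" % (to_latex(left), to_latex(right))
    if op == "*":
        return "%s\\cdot %s" % (wrap_sum(left), wrap_sum(right))
    if op == "+":
        return to_latex(left) + "+" + to_latex(right)
    if op == "-":
        return to_latex(left) + "-" + wrap_sum(right)
    raise ValueError("unsupported operator: %r" % (op,))
```

For the question's example, the tree ("/", ("+", "a", "4"), ("*", "b", "10")) is rendered as \frac{a+4}{b\cdot 10}.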

How to write a grammar in Antlr4 for a function with zero arguments

Question: I have a grammar for functions with arguments, split into the lexer and parser below: MyFunctionsLexer.g4 lexer grammar MyFunctionsLexer; FUNCTION: 'FUNCTION'; NAME: [A-Za-z0-9]+; DOT: '.'; COMMA: ','; L_BRACKET: '('; R_BRACKET: ')'; WS : [ \t\r\n]+ -> skip; MyFunctionsParser.g4 parser grammar MyFunctionsParser; options { tokenVocab=MyFunctionsLexer; } functions : function* EOF; function : FUNCTION '.' NAME '(' (function | argument (',' argument)*) ')'; argument: (NAME | function); But in parser is