antlr4 | 易学教程

ANTLR4: TokenStreamRewriter output doesn't have proper format (removes whitespaces)

Question: I am using ANTLR4 and the Java 7 grammar (source) to modify an input Java source file. More specifically, I am using the TokenStreamRewriter class to modify some tokens. The following sample shows how the tokens are modified:

    public class TestListener extends JavaBaseListener {
        private TokenStreamRewriter rewriter;
        rewriter = new TokenStreamRewriter(tokenStream);
        rewriter.replace(ctx.getStart(), ctx.getStop(), "someText");
    }

When I print the altered source code, the white spaces

ANTLR4 parse tree simplification

Is there any means to get ANTLR4 to automatically remove redundant nodes in generated parse trees? More specifically, I've been experimenting with a grammar for GLSL and you end up with long linear sequences of "expressions" in the parse tree due to the rule forwarding needed to give the automatic handling of operator precedence. Most of the generated tree nodes are simply "forward to the next level of precedence", so don't provide any useful syntactic information - you only really need the last expression node in each sequence (i.e. the point at which the rule forwarding stopped), or the
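The idea of keeping only the last node of each forwarding chain can be sketched as a post-processing pass, independent of ANTLR. The minimal sketch below assumes a hypothetical SketchNode class (it is not ANTLR's ParseTree API) and the rule names are made up for illustration. A node is dropped when its only child is itself a rule node, so the node where the forwarding stopped survives.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node for illustration; NOT ANTLR's ParseTree API.
final class SketchNode {
    final String label;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String label, SketchNode... kids) {
        this.label = label;
        for (SketchNode k : kids) {
            children.add(k);
        }
    }

    // Returns a simplified copy of the tree. A node whose single child is
    // itself an inner node is a pure "forwarding" node and is skipped, so
    // each chain collapses to its last rule node.
    static SketchNode collapse(SketchNode n) {
        if (n.children.size() == 1 && !n.children.get(0).children.isEmpty()) {
            return collapse(n.children.get(0));
        }
        SketchNode copy = new SketchNode(n.label);
        for (SketchNode c : n.children) {
            copy.children.add(collapse(c));
        }
        return copy;
    }

    // LISP-style rendering, e.g. (additive (primary 1) + (primary 2)).
    @Override
    public String toString() {
        if (children.isEmpty()) {
            return label;
        }
        StringBuilder sb = new StringBuilder("(").append(label);
        for (SketchNode c : children) {
            sb.append(' ').append(c);
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        SketchNode left = new SketchNode("multiplicative",
                new SketchNode("unary", new SketchNode("primary", new SketchNode("1"))));
        SketchNode right = new SketchNode("multiplicative",
                new SketchNode("unary", new SketchNode("primary", new SketchNode("2"))));
        SketchNode tree = new SketchNode("expression",
                new SketchNode("additive", left, new SketchNode("+"), right));
        System.out.println(collapse(tree)); // prints (additive (primary 1) + (primary 2))
    }
}
```

In a real project the same traversal would have to be adapted to whatever tree type the generated parser produces; this sketch only shows the collapsing rule itself.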

ANTLRv4: non-greedy rules

I'm reading The Definitive ANTLR 4 Reference and have a question about one of the examples (p. 76):

    STRING: '"' (ESC|.)*? '"';
    fragment ESC: '\\"' | '\\\\' ;

The rule matches a typical C++ string: a character sequence enclosed in "" that can also contain \". I expected the rule STRING to match the smallest string possible because of the non-greedy construct. So if it sees a \", it would map \ to . and " to the closing " at the end of the rule, since this would result in the smallest string possible. Instead, \" is mapped to ESC. I don't understand this, since it is not what I

Syntax of semantic predicates in Antlr4

In "What is a 'semantic predicate' in ANTLR3?", Bart Kiers gives a very good overview of the different semantic predicates in ANTLR3. Unfortunately, the syntax and semantics seem to have changed in ANTLR4, so this does not compile:

    end_of_statement
        : ';'
        | EOF
        | {input.LT(1).getType() == RBRACE}? =>
        ;

    RBRACE : '}' ;

Could someone please tell me how to handle the third case of end_of_statement: accept if the next token is a '}' but do not consume it?

There is now just a single type of semantic predicate, which looks like this:

    { <<boolean-expression>> }?

And the input attribute from the abstract class

ANTLRv4: How to read double quote escaped double quotes in string?

Question: In ANTLR v4, how do we parse a string in which double quotes are escaped by doubling them, as in VBA? For the text

    "some string with ""john doe"" in it"

the goal is to identify the string

    some string with "john doe" in it

And is it possible to rewrite it, turning doubled double quotes into single double quotes ("" -> ")?

Answer 1: Like this:

    STRING : '"' (~[\r\n"] | '""')* '"' ;

where ~[\r\n"] | '""' means:

    ~[\r\n"]    # any char other than '\r', '\n' and double quotes
    |           # OR
    '""'        # two successive double quotes

And is it possible to rewrite it to turn double double quotes in single double quotes? Not
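The "" -> " rewrite asked about above can also be done in the host program once the lexer has produced the STRING token. The following is a minimal sketch under that assumption: VbaStringLiteral and unquote are hypothetical names, and the input is assumed to be the raw token text (for example, what getText() returns for a STRING token), including the surrounding quotes.

```java
// Hypothetical helper for illustration; not part of ANTLR.
final class VbaStringLiteral {

    // Converts raw token text such as "say ""hi""" (quotes included)
    // into the string value it denotes: say "hi"
    static String unquote(String tokenText) {
        if (tokenText.length() < 2
                || !tokenText.startsWith("\"")
                || !tokenText.endsWith("\"")) {
            throw new IllegalArgumentException("not a quoted string: " + tokenText);
        }
        // Strip the enclosing quotes, then turn each doubled quote into one.
        String body = tokenText.substring(1, tokenText.length() - 1);
        return body.replace("\"\"", "\"");
    }

    public static void main(String[] args) {
        String raw = "\"some string with \"\"john doe\"\" in it\"";
        System.out.println(unquote(raw)); // prints: some string with "john doe" in it
    }
}
```

Keeping the unescaping out of the grammar leaves the lexer rule unchanged; the helper only relies on the rule guaranteeing that quotes inside the body always come in pairs.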

In antlr4 lexer, How to have a rule that catches all remaining “words” as Unknown token?

I have an ANTLR4 lexer grammar. It has many rules for words, but I also want it to create an Unknown token for any word that it cannot match with the other rules. I have something like this:

    Whitespace : [ \t\n\r]+ -> skip;
    Punctuation : [.,:;?!];
    // Other rules here
    Unknown : .+? ;

The generated matcher now catches '~' as Unknown, but it creates three '~' Unknown tokens for the input '~~~' instead of a single '~~~' token. How can I tell the lexer to generate a single word token for consecutive unknown characters? I also tried "Unknown: . ;" and "Unknown : .+ ;" with no success. EDIT: In current antlr versions .+?

ANTLR4 Semantic Predicates that is Context Dependent Does Not Work

Question: I am parsing a C++-like declaration with this scaled-down grammar (many details removed to make it a fully working example). It mysteriously fails to work (at least to me). Is this related to the use of a context-dependent predicate? If so, what is the proper way to implement the "count the number of child nodes" logic?

    grammar CPPProcessor;
    cppCompilationUnit : decl_specifier_seq? init_declarator* ';' EOF;
    init_declarator: declarator initializer?;
    declarator: identifier;
    initializer: '=0';

How to automatically generate lexer+parser with ANTLR4 and Maven?

I'm new to ANTLR4, and it seems that there is no Eclipse plug-in for v4, so it would be nice to build the Java sources from the .g4 grammars automatically. I have a simple, empty Maven project with src/main/java and src/test/java. Where should I place the .g4 files? How can I build the grammars automatically with Maven? My own POM test failed:

    <repository>
      <id>mvn-public</id>
      <name>MVNRepository</name>
      <url>http://mvnrepository.com</url>
    </repository>
    ...
    <build>
      <plugins>
        <plugin>
          <groupId>org.antlr</groupId>
          <artifactId>antlr4-maven-plugin</artifactId>
          <version>4.0.0</version>
          <executions>
            <execution>

antlr4-Can't load Hello as lexer or parser

I recently had to use a parser for a project. I downloaded ANTLR4 and followed the steps described in the book The Definitive ANTLR 4 Reference. These are the steps I typed on the command line:

    1. export CLASSPATH=".:/<Mydirectory>/antlr-4.2.2-complete.jar:$CLASSPATH"
    2. alias antlr4='java -jar /<My directory>/antlr-4.2.2-complete.jar'
    3. alias grun='java org.antlr.v4.runtime.misc.TestRig'
    4. antlr4 Hello.g4

Everything worked fine, and it generated the Java files that I need. However, after I enter

    5. grun Hello r -tokens

it reports "Can't load Hello as lexer or parser". I googled for some information, but still