antlr4

Is it advisable to use tokens for the purpose of syntax highlighting?

老子叫甜甜 posted on 2019-12-03 21:43:06
I'm trying to implement syntax highlighting in C# on Android, using Xamarin. I'm using the ANTLR v4 library for C# to achieve this. My code, which is currently syntax highlighting Java with this grammar, does not attempt to build a parse tree and use the visitor pattern. Instead, I simply convert the input into a list of tokens: private static IList<IToken> Tokenize(string text) { var inputStream = new AntlrInputStream(text); var lexer = new JavaLexer(inputStream); var tokenStream = new CommonTokenStream(lexer); tokenStream.Fill(); return tokenStream.GetTokens(); } Then I loop through all of
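The token-only approach described in this excerpt can be sketched without any parser at all. The following is a minimal illustration in Java, not the poster's code and not the ANTLR API: `Tok`, `Span`, the palette, and the token type names are all hypothetical stand-ins. It assumes each token carries a type plus inclusive start and stop character offsets, which is how ANTLR tokens report their position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TokenHighlightSketch {
    /** Hypothetical stand-in for a lexer token: a type name plus inclusive character offsets. */
    static final class Tok {
        final String type;
        final int start;
        final int stop;
        Tok(String type, int start, int stop) {
            this.type = type;
            this.start = start;
            this.stop = stop;
        }
    }

    /** A colored region of the text, expressed as start offset and length. */
    static final class Span {
        final int start;
        final int length;
        final String color;
        Span(int start, int length, String color) {
            this.start = start;
            this.length = length;
            this.color = color;
        }
    }

    /** Maps each token whose type appears in the palette to a colored span; other tokens are left unstyled. */
    static List<Span> highlight(List<Tok> tokens, Map<String, String> palette) {
        List<Span> spans = new ArrayList<>();
        for (Tok t : tokens) {
            String color = palette.get(t.type);
            if (color != null) {
                // stop is inclusive, so the length is stop - start + 1
                spans.add(new Span(t.start, t.stop - t.start + 1, color));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // Tokens for the text "class Foo" (offsets chosen by hand for this example)
        List<Tok> tokens = new ArrayList<>();
        tokens.add(new Tok("KEYWORD", 0, 4));
        tokens.add(new Tok("IDENTIFIER", 6, 8));

        Map<String, String> palette = new HashMap<>();
        palette.put("KEYWORD", "#0000FF");

        for (Span s : highlight(tokens, palette)) {
            System.out.println(s.start + "+" + s.length + " -> " + s.color);
        }
    }
}
```

A flat token list is enough whenever the color depends only on the token type; coloring that depends on context (for example, distinguishing a type name from a variable name) would need information the lexer alone does not provide.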

ANTLR and Eclipse (or any decent IDE)

筅森魡賤 posted on 2019-12-03 13:10:30
I have been using ANTLR with Eclipse for some time using the ANTLRv3IDE plugin. While it is not perfect, and a bit outdated, it does its job reasonably well. Now I am looking to switch to ANTLRv4 for another DSL that I am creating. However, Eclipse support seems to be extremely thin. I decided to try out ANTLRWorks, which is a NetBeans plugin, but I could not get it to install (it seems to be locked to specific dated versions (201302132200 while I have something newer, still 7.3 as docs say) of dependencies). So, the question: Has anyone set up any Java IDE (preferably Eclipse, but I could be

How do I pretty-print productions and line numbers, using ANTLR4?

纵饮孤独 posted on 2019-12-03 12:53:38
I'm trying to write a piece of code that will take an ANTLR4 parser and use it to generate ASTs for inputs similar to the ones given by the -tree option on grun (misc.TestRig). However, I'd additionally like for the output to include all the line number/offset information. For example, instead of printing (add (int 5) '+' (int 6)) I'd like to get (add (int 5 [line 3, offset 6:7]) '+' (int 6 [line 3, offset 8:9]) [line 3, offset 5:10]) Or something similar. There aren't a tremendous number of visitor examples for ANTLR4 yet, but I am pretty sure I can do most of this by copying the default
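The output format requested in this excerpt can be produced by a short recursive printer. The sketch below is a self-contained Java illustration over a hypothetical `Node` class, not ANTLR's own tree types. With the real runtime, the line and offsets would presumably be read from each rule context's start and stop tokens (`ParserRuleContext.getStart()` and `getStop()`, then `Token.getLine()`, `getStartIndex()` and `getStopIndex()`); here they are simply stored on the node.

```java
import java.util.ArrayList;
import java.util.List;

public class TreePrinterSketch {
    /** Hypothetical tree node: rule nodes carry a position, leaves carry only their text. */
    static final class Node {
        final String label;
        final int line;
        final int start;
        final int stop;
        final List<Node> children = new ArrayList<>();

        Node(String label, int line, int start, int stop) {
            this.label = label;
            this.line = line;
            this.start = start;
            this.stop = stop;
        }

        /** Creates a terminal node; its position fields are unused. */
        static Node leaf(String text) {
            return new Node(text, -1, -1, -1);
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Renders the tree in LISP style, appending "[line L, offset S:E]" to every rule node. */
    static String render(Node n) {
        if (n.children.isEmpty()) {
            return n.label; // terminals are printed as plain text
        }
        StringBuilder sb = new StringBuilder("(").append(n.label);
        for (Node child : n.children) {
            sb.append(' ').append(render(child));
        }
        sb.append(" [line ").append(n.line)
          .append(", offset ").append(n.start).append(':').append(n.stop).append("])");
        return sb.toString();
    }

    /** Builds the example tree from the question by hand. */
    static Node example() {
        return new Node("add", 3, 5, 10)
                .add(new Node("int", 3, 6, 7).add(Node.leaf("5")))
                .add(Node.leaf("'+'"))
                .add(new Node("int", 3, 8, 9).add(Node.leaf("6")));
    }

    public static void main(String[] args) {
        System.out.println(render(example()));
    }
}
```

Printing the position after the children, just before the closing parenthesis, matches the layout shown in the question; moving it directly after the rule name would be an equally valid choice.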

Is “Implicit token definition in parser rule” something to worry about?

微笑、不失礼 posted on 2019-12-03 11:39:29
Question: I'm creating my first grammar with ANTLR and ANTLRWorks 2. I have mostly finished the grammar itself (it recognizes the code written in the described language and builds correct parse trees), but I haven't started anything beyond that. What worries me is that every first occurrence of a token in a parser rule is underlined with a yellow squiggle saying "Implicit token definition in parser rule". For example, in this rule, the 'var' has that squiggle: variableDeclaration: 'var' IDENTIFIER ('='

Intellij will not recognize antlr generated source code

邮差的信 posted on 2019-12-03 11:19:34
I am having trouble getting Intellij to recognize the generated source code from antlr4. Any reference to the generated code appears as an error, code completion doesn't work, etc. I am using maven and the antlr4-maven-plugin to generate the code. My code, which references the generated code, compiles and builds fine under maven. The generated code is under /target/generated-sources/antlr4, which is what Intellij expects. I have tried the usual fixes such as reimport maven projects, update folders, invalidate cache, etc. None of it seems to work. Anyone seen this before? Is there a way to point to the

ANTLR4: Whitespace handling

老子叫甜甜 posted on 2019-12-03 09:57:22
I have seen many ANTLR grammars that use whitespace handling like this: WS: [ \n\t\r]+ -> skip; // or WS: [ \n\t\r]+ -> channel(HIDDEN); So whitespace is either thrown away or sent to the hidden channel. With a grammar like this: grammar Not; start: expression; expression: NOT expression | (TRUE | FALSE); NOT: 'not'; TRUE: 'true'; FALSE: 'false'; WS: [ \n\t\r]+ -> skip; valid inputs are ' not true ' or ' not false ' but also ' nottrue ' which is not a desired result. Changing the grammar to: grammar Not; start: expression; expression: NOT WS+ expression | (TRUE | FALSE); NOT: 'not';
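Why ' nottrue ' is accepted can be seen by simulating the lexer by hand. The Java sketch below is a simplified stand-in, not ANTLR itself: it skips whitespace and picks the longest match at each position, with earlier rules winning ties, which is the rule an ANTLR lexer follows. The optional `WORD` rule (equivalent to something like `[a-z]+` placed after the keywords) is an assumption added for illustration and does not appear in the question; it shows one possible way to make 'nottrue' lex as a single token that the parser would then reject.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NotLexerSketch {
    /**
     * Tokenizes the input with longest-match semantics and returns the token type names.
     * Whitespace is skipped. If withWordRule is true, a catch-all [a-z]+ rule named WORD
     * (hypothetical, not part of the original grammar) competes with the keywords.
     */
    static List<String> lex(String input, boolean withWordRule) {
        Map<String, String> keywords = new LinkedHashMap<>();
        keywords.put("not", "NOT");
        keywords.put("true", "TRUE");
        keywords.put("false", "FALSE");

        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == ' ' || c == '\n' || c == '\t' || c == '\r') {
                pos++; // corresponds to WS -> skip
                continue;
            }
            String bestType = null;
            int bestLen = 0;
            for (Map.Entry<String, String> rule : keywords.entrySet()) {
                String literal = rule.getKey();
                if (input.startsWith(literal, pos) && literal.length() > bestLen) {
                    bestType = rule.getValue();
                    bestLen = literal.length();
                }
            }
            if (withWordRule) {
                int end = pos;
                while (end < input.length() && input.charAt(end) >= 'a' && input.charAt(end) <= 'z') {
                    end++;
                }
                // Strictly longer only: on a tie the keyword wins because it is listed first
                if (end - pos > bestLen) {
                    bestType = "WORD";
                    bestLen = end - pos;
                }
            }
            if (bestType == null) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
            tokens.add(bestType);
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex(" nottrue ", false));
        System.out.println(lex(" not true ", true));
        System.out.println(lex(" nottrue ", true));
    }
}
```

Without the extra rule, 'nottrue' yields the same token sequence as 'not true', so the parser cannot tell them apart. With it, 'nottrue' becomes one `WORD` token while 'not true' still lexes as the two keywords.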

How to configure antlr4 plugin for Intellij IDEA

点点圈 posted on 2019-12-03 04:17:47
Question: I looked all over the place for how to configure the antlr4 plugin for IntelliJ IDEA, but I can't find anything. I was only able to install the plugin. If I add .g4 files manually to an empty project I get the "Generate ANTLR Recognizer" option in the right-click menu. That is all. I thought it was a very promising plugin. Can anyone please tell me how to proceed with the plugin? Thank you. Answer 1: I installed the ANTLR plugin on IntelliJ 14 and was able to get it working. A couple little

how to use antlr4 visitor

感情迁移 posted on 2019-12-03 02:37:31
I am an antlr beginner. I was trying to use a visitor in my code, following instructions I found online. However, I found out that the visitor was not entering the method I created at all. Can anyone tell me what I did wrong? This is my visitor: import java.util.LinkedList; import org.antlr.v4.runtime.misc.NotNull; /* * To change this template, choose Tools | Templates * and open the template in the editor. */ /** * * @author Sherwood */ public class ExtractMicroBaseVisitor extends MicroBaseVisitor<Integer> { //LinkedList<IR> ll = new LinkedList<IR>(); //MicroParser parser; //System.out

Antlr4 C# targets and output path of generated files

泪湿孤枕 posted on 2019-12-02 23:50:31
I have a C# solution with an Antlr3 grammar file, and I'm trying to upgrade to Antlr4. It turns out the grammar was the easy part (it became better, and one third the size!). Generating the parser turned out to be the tricky part. In the old solution I merely ran AntlrWorks to update the lexer and parser .cs files when the grammar file changed. The lexer and parser were included directly in the same project as the grammar so the framework around the parser could make use of them directly. With the Antlr4 targets for C# I noticed that (at least by default) the output path of the generated

ANTLR 4 lexer tokens inside other tokens

主宰稳场 posted on 2019-12-02 21:15:46
I have the following grammar for ANTLR 4: grammar Pattern; //parser rules parse : string LBRACK CHAR DASH CHAR RBRACK ; string : (CHAR | DASH)+ ; //lexer rules DASH : '-' ; LBRACK : '[' ; RBRACK : ']' ; CHAR : [A-Za-z0-9] ; And I'm trying to parse the following string ab-cd[0-9] The code parses out the ab-cd on the left which will be treated as a literal string in my application. It then parses out [0-9] as a character set which in this case will translate to any digit. My grammar works for me except I don't like to have (CHAR | DASH)+ as a parser rule when it's simply being treated as a token