antlr4

Handling line feed in ANTLR4 grammar with Python target

青春壹個敷衍的年華 submitted on 2019-12-24 19:44:35
Question: I am working on an ANTLR4 grammar for parsing Python DSL scripts (basically a subset of Python), with the target set to Python 3. I am having difficulty handling line feeds. In my grammar, I use the lexer::members and NEWLINE embedded code from Bart Kiers's Python3 grammar for ANTLR4, ported to Python so that it can be used with the Python 3 runtime for ANTLR instead of Java. My grammar differs from the one provided by Bart (which is almost the same as the one used in the Python 3 spec)

Wrong rule matched

别说谁变了你拦得住时间么 submitted on 2019-12-24 18:52:25
Question: I have the following in the lexer: INTEGER : DIGIT+; NOT: '!'; MINUS:'-'; PLUS:'+'; fragment DIGIT: '0'..'9'; I have the following in the parser: expr: intLiteral | UnaryOp expr; intLiteral: (PLUS|MINUS)? INTEGER; UnaryOp: NOT|MINUS; When I use grun to test it with -2, it is matched as UnaryOp expr instead of just intLiteral. In other words, the minus sign is being detected as a UnaryOp. Why is this occurring, and is there a way to fix it? Answer 1: The practice is to use all CAPITALS

antlr4 grammar for splitting the number

≯℡__Kan透↙ submitted on 2019-12-24 18:17:22
Question: grammar Te; /* * Parser Rules */ test : (example+ EOF); example: digit COMMA digit2 NEWLINE; digit: (INT)+? ; digit2: (INT INT INT INT)+?; /* * Lexer Rules */ INT :[0-9]; COMMA: ','; NEWLINE : ('\r'? '\n' | '\r')+ ; This is the grammar I have written to split the number sequence into single digits until a COMMA is detected, and afterwards to split the number sequence into groups of 4 digits. For example, let my input be 00000,12345678912345678912. Now it should consider 00000 and split it into
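
As a plain-Java illustration of the splitting described in this question (not an ANTLR solution), the sketch below breaks the part before the comma into single digits and the part after it into groups of four. The names `DigitSplitter`, `chunk` and `split` are hypothetical, and the sketch assumes the input contains a comma and that the second part's length is a multiple of four.

```java
import java.util.ArrayList;
import java.util.List;

public class DigitSplitter {
    // Cut a digit string into consecutive chunks of the given size.
    public static List<String> chunk(String digits, int size) {
        if (digits.length() % size != 0) {
            throw new IllegalArgumentException(
                "length " + digits.length() + " is not a multiple of " + size);
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < digits.length(); i += size) {
            out.add(digits.substring(i, i + size));
        }
        return out;
    }

    // Single digits before the first comma, groups of four after it.
    public static List<List<String>> split(String line) {
        int comma = line.indexOf(',');
        if (comma < 0) {
            throw new IllegalArgumentException("no comma in input");
        }
        List<List<String>> result = new ArrayList<>();
        result.add(chunk(line.substring(0, comma), 1));
        result.add(chunk(line.substring(comma + 1), 4));
        return result;
    }

    public static void main(String[] args) {
        // Sample input taken from the question.
        System.out.println(split("00000,12345678912345678912"));
    }
}
```

In a grammar, the same intent is usually easier to express by lexing each side of the comma as one token and splitting its text afterwards, because a lexer rule such as INT : [0-9] cannot know which side of the comma it is on.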

ANTLR4 - Generate code from non-file inputs?

霸气de小男生 submitted on 2019-12-24 17:26:13
Question: Where do we start to manually build a CST from scratch? Or does ANTLR4 always require the lex/parse process as our input step? I have some visual elements in my program that represent code structures. For example, a square represents a class, while a circle embedded within that square represents a method. Now I want to turn those into code. How do I use ANTLR4 to do this at runtime (using ANTLR4.js)? Most of the ANTLR examples seem to rely on lexing and parsing existing code to get to a syntax tree.

Canonicalizing token text in ANTLR

那年仲夏 submitted on 2019-12-24 17:00:11
Question: Is there a way in ANTLR to mark certain tokens as having canonical output? For example, given the grammar (excerpt) words : FOO BAR BAZ FOO : [Ff] [Oo] [Oo] BAR : [Bb] [Aa] [Rr] BAZ : [Bb] [Aa] [Zz] SP : [ ] -> channel(HIDDEN); words will match "FOO BAR BAZ", "foo bar baz", "Foo bAr baZ", etc. When I call TokenStream#getText(Context), it returns the tokens' actual text concatenated together. Is there a way to "canonicalize" this output such that no matter what the input, all FOO tokens

Are there any good examples to references where setBuildParseTree = false?

雨燕双飞 submitted on 2019-12-24 13:27:38
Question: I'm using ANTLR for a simple CSV parser. I'd like to use it on a 29 GB file, but it runs out of memory on the ANTLRInputStream call: CharStream cs = new ANTLRInputStream(new BufferedInputStream(input,8192)); CSVLexer lexer = new CSVLexer(cs); CommonTokenStream tokens = new CommonTokenStream(lexer); CSVParser parser = new CSVParser(tokens); ParseTree tree = parser.file(); ParseTreeWalker walker = new ParseTreeWalker(); walker.walk(myListener, tree); I tried to change it to be an unbuffered

Error when generating a grammar for chess PGN files

纵然是瞬间 submitted on 2019-12-24 13:11:19
Question: I made this ANTLR4 grammar in order to parse a PGN inside my Java program, but I can't manage to resolve the ambiguity in it: grammar Pgn; file: game (NEWLINE+ game)*; game: (tag+ NEWLINE+)? notation; tag: [TAG_TYPE "TAG_VALUE"]; notation: move+ END_RESULT?; move: MOVE_NUMBER\. MOVE_DESC MOVE_DESC #CompleteMove | MOVE_NUMBER\. MOVE_DESC #OnlyWhiteMove | MOVE_NUMBER\.\.\. MOVE_DESC #OnlyBlackMove ; END_RESULT: '1-0' | '0-1' | '1/2-1/2' ; TAG_TYPE: LETTER+; TAG_VALUE: .*; MOVE_NUMBER: DIGIT+;

ANTLR : How to parse fixed length text file based on index position using ANTLR 4?

烂漫一生 submitted on 2019-12-24 11:22:47
Question: Input: 101 04200001312345678981107291600A094101US FORD NA TEST COMPANY101 5225TEST COMPANY 11234567898PPDTEST BUYS 110801110801 1098765430000001 The lines above are fixed-length, 94 characters each. Expected output: Based on this input, the ANTLR grammar should parse according to index positions. For example: if the parser identifies '1' as the first character of line one, it should recognize the entire line as a separate string, HEADER1. Likewise, if the parser finds '5' at the starting index of line two, it should recognize the entire line
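
A minimal plain-Java sketch of the position-based classification described in this question, shown without ANTLR. Only the 94-character record length and the HEADER1 label come from the question. The class name `FixedLengthRecords`, the helper `pad`, and the labels `RECORD_5` and `UNKNOWN` are hypothetical placeholders, since the excerpt does not say what a line starting with '5' should be called.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedLengthRecords {
    // From the question: every line is exactly 94 characters long.
    public static final int RECORD_LENGTH = 94;

    // Label one record by the character at index 0.
    // "HEADER1" is from the question; the other labels are hypothetical.
    public static String classify(String line) {
        if (line.length() != RECORD_LENGTH) {
            throw new IllegalArgumentException(
                "expected " + RECORD_LENGTH + " characters, got " + line.length());
        }
        switch (line.charAt(0)) {
            case '1': return "HEADER1";
            case '5': return "RECORD_5";
            default:  return "UNKNOWN";
        }
    }

    // Classify every non-empty line of a multi-line input.
    public static List<String> classifyAll(String input) {
        List<String> labels = new ArrayList<>();
        for (String line : input.split("\\r?\\n")) {
            if (!line.isEmpty()) {
                labels.add(classify(line));
            }
        }
        return labels;
    }

    // Hypothetical helper: right-pad a short sample with spaces to 94 characters.
    public static String pad(String s) {
        return String.format("%-" + RECORD_LENGTH + "s", s);
    }

    public static void main(String[] args) {
        String input = pad("101 042000013") + "\n" + pad("5225TEST COMPANY");
        System.out.println(classifyAll(input));
    }
}
```

Individual fields could then be read with `substring(start, end)` at their fixed offsets. Those offsets are not given in the excerpt, so none are assumed here.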

NegativeArraySizeException ANTLRv4

℡╲_俬逩灬. submitted on 2019-12-24 10:39:41
Question: I have a 10 GB file that I need to parse in Java, but the following error arises when I attempt to do this. java.lang.NegativeArraySizeException at java.util.Arrays.copyOf(Arrays.java:2894) at org.antlr.v4.runtime.ANTLRInputStream.load(ANTLRInputStream.java:123) at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:86) at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:82) at org.antlr.v4.runtime.ANTLRInputStream.<init>(ANTLRInputStream.java:90) How can
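
The stack trace points at `Arrays.copyOf` inside `ANTLRInputStream.load`, which reads the input into a single `char[]`. Java arrays are sized by `int`, so one array cannot hold a 10 GB file. The sketch below assumes, for illustration, that the buffer grows by doubling its length. It shows how doubling a large `int` capacity wraps around to a negative number, the kind of size that triggers `NegativeArraySizeException`. `BufferGrowthDemo` and its methods are hypothetical names, not part of ANTLR.

```java
public class BufferGrowthDemo {
    // Doubling an int capacity; wraps around once capacity reaches 2^30.
    public static int doubled(int capacity) {
        return capacity * 2;
    }

    // Report whether allocating a char[] of this size is rejected as negative.
    public static boolean allocationThrows(int size) {
        try {
            char[] buffer = new char[size];
            return buffer.length != size; // always false when allocation succeeds
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int capacity = 1 << 30;            // 1,073,741,824 chars
        int next = doubled(capacity);      // overflows the int range
        System.out.println("doubled capacity: " + next);
        System.out.println("allocation rejected: " + allocationThrows(next));
    }
}
```

Because the limit comes from the array size itself, giving the JVM more heap would not remove this error. The input would need to be consumed as a stream instead of being loaded whole.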

In antlr visitor pattern how to navigate from one method to another

梦想的初衷 submitted on 2019-12-24 10:39:07
Question: I am new to ANTLR and want to know how to navigate from one visitor method to another. I want the implementation below to be done using ANTLR4. I have the functions listed below. Here is the GitHub link to the project: https://github.com/VIKRAMAS/AntlrNestedFunctionParser/tree/master 1. FUNCTION.add(Integer a,Integer b) 2. FUNCTION.concat(String a,String b) 3. FUNCTION.mul(Integer a,Integer b) And I am storing the functions' metadata like this. Map<String,String> map=new HashMap