ANTLR4: TokenStreamRewriter output doesn't have proper format (removes whitespaces)

问题

I am using Antlr4 and java7 grammar (source) for modifying an input Java Source file. More specifically, I am using the TokenStreamRewriter class to modify some tokens. The following code is a sample that shows how the tokens are modified:

public class TestListener extends JavaBaseListener {    
   private TokenStreamRewriter rewriter;
   rewriter = new TokenStreamRewriter(tokenStream);
   rewriter.replace(ctx.getStart(), ctx.getStop(), "someText");
}

When I print the altered source code, the white spaces and tabs are removed and the new source file's format is like this:

importjava.util.ArrayList;publicclassMain{publicstaticvoidmain(String[]args{MyTimertimer=newMyTimer();}}

I am using extractor.getText() for printing it back.

Is this a problem of the grammar used or should I use some other method from the TokenStreamRewriter class?

回答1:

The issue is that the lexer is not sending white space to the parser, which means that the rewrite stream doesn't have access to the tokens either. It is because of the skip lexer command:

WS : [ \t\r\n\u000C]+ -> skip ;

You have to change all those to -> channel(HIDDEN) which will send them to the parser on a different channel, making them available in the token stream, but invisible to the parser.

来源：https://stackoverflow.com/questions/21889071/antlr4-tokenstreamrewriter-output-doesnt-have-proper-format-removes-whitespac

标签

parsing

antlr

token

antlr4

lexer

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!