ANTLR: Get token name?

Posted by 為{幸葍}努か on 2020-08-24 05:36:08

Question


I've got a grammar rule,

OR
    : '|';

But when I print the AST using,

public static void Preorder(ITree tree, int depth)
{
    if (tree == null)
    {
        return;
    }

    for (int i = 0; i < depth; i++)
    {
        Console.Write("  ");
    }

    Console.WriteLine(tree);

    for (int i = 0; i < tree.ChildCount; i++)
    {
        Preorder(tree.GetChild(i), depth + 1);
    }
}

(Thanks Bart) it displays the actual | character. Is there a way I can get it to say "OR" instead?


Answer 1:


Robert inspired this answer.

if (ExpressionParser.tokenNames[tree.Type] == tree.Text)
    Console.WriteLine(tree.Text);
else
    Console.WriteLine("{0} '{1}'", ExpressionParser.tokenNames[tree.Type], tree.Text);
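
The same check can be folded into the traversal from the question. Below is a minimal sketch, written in Python rather than C# and using stand-in types: `Node` and the `TOKEN_NAMES` table are hypothetical and only mimic the `Type`, `Text`, and `tokenNames` members used above. They are not part of any ANTLR runtime.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stand-in for the generated parser's tokenNames array:
# index = token type, value = token name.
TOKEN_NAMES = ["<invalid>", "OR", "ID"]


@dataclass
class Node:
    """Hypothetical tree node carrying a token type, its text, and children."""
    type: int
    text: str
    children: List["Node"] = field(default_factory=list)


def describe(node: Node) -> str:
    """Return the token name, plus the matched text when the two differ."""
    name = TOKEN_NAMES[node.type]
    if name == node.text:
        return node.text
    return f"{name} '{node.text}'"


def preorder(node: Node, depth: int = 0) -> List[str]:
    """Collect one indented line per node, parents before children."""
    lines = ["  " * depth + describe(node)]
    for child in node.children:
        lines.extend(preorder(child, depth + 1))
    return lines


# An OR node with two identifier children, as the grammar rule might produce.
tree = Node(1, "|", [Node(2, "a"), Node(2, "b")])
print("\n".join(preorder(tree)))
```

With this toy tree, the root line shows the name `OR` next to the literal `|` instead of the bare character.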



Answer 2:


I had to do this a couple of weeks ago, but with the Python ANTLR. It doesn't help you much, but it might help somebody else searching for an answer.

With Python ANTLR, token types are integers. The token text is included in the token object. Here's the solution I used:

import antlrGeneratedLexer

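# Reverse map: token type (int) -> token name.
token_names = {}
# items() works on Python 2 and 3; iteritems() exists only on Python 2.
for name, value in antlrGeneratedLexer.__dict__.items():
    # Token constants are module-level integers with all-uppercase names.
    if isinstance(value, int) and name == name.upper():
        token_names[value] = name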

There's no apparent logic to the numbering of tokens (at least, with Python ANTLR), and as far as I could tell the token names are not stored as strings except in the module __dict__, so this is the only way I found of getting to them.
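
For anyone who wants to try the reverse lookup without generating a lexer first, here is a small Python 3 sketch. The module `fake_lexer` and its constants `OR`, `ID`, and `WS` are made up for illustration and merely stand in for `antlrGeneratedLexer`; the filter is the same one used above.

```python
import types

# Hypothetical stand-in for an ANTLR-generated lexer module; a real one
# would be imported instead. The numeric values are arbitrary.
fake_lexer = types.ModuleType("fake_lexer")
fake_lexer.OR = 4
fake_lexer.ID = 5
fake_lexer.WS = 6
fake_lexer.helper_count = 3  # lowercase int attribute, must be skipped


def build_token_names(module):
    """Map token type (int) -> name for every all-uppercase int attribute."""
    return {
        value: name
        for name, value in vars(module).items()
        if isinstance(value, int) and name == name.upper()
    }


token_names = build_token_names(fake_lexer)
print(token_names[fake_lexer.OR])
```

One caveat of this approach: if two uppercase constants share the same integer value, only one of the names survives in the dictionary.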

I would guess that in C# token types are in an enumeration, and I believe enumerations can be printed as strings. But that's just a guess.




Answer 3:


Boy, I spent way too much time banging my head against a wall trying to figure this out. Mark's answer gave me the hint I needed, and it looks like the following will get the token name from a TerminalNode in Antlr 4.5:

myLexer.getVocabulary().getSymbolicName(myTerminalNode.getSymbol().getType())

or, in C#:

myLexer.Vocabulary.GetSymbolicName(myTerminalNode.Symbol.Type)

(Looks like you can actually get the vocabulary from either the parser or the lexer.)

Those vocabulary methods seem to be the preferred way to get at the tokens in Antlr 4.5, and tokenNames appears to be deprecated.

It does seem needlessly complicated for what I think is a pretty basic operation, so maybe there's an easier way.




Answer 4:


I'm new to Antlr, but it seems ITree is not necessarily tied to a Parser (in .NET). Instead, there is a derived interface, IParseTree, returned from the Parser (in Antlr4), and it contains a few additional methods, including this overload:

string ToStringTree(Parser parser);

It converts the whole node subtree into a text representation, which is useful in some cases. If you'd like to see just the name of one particular node without its children, use the static method in the Trees class:

public static string GetNodeText(ITree t, Parser recog);

This method does basically the same as Mark and Robert suggested, but in a more general and flexible way.



Source: https://stackoverflow.com/questions/4403878/antlr-get-token-name
