Why does parser generated by ANTLR reuse context objects?

Posted by ﹥>﹥吖頭↗ on 2019-12-12 01:19:29

Question


I'm trying to write an interpreter for a simple programming language using ANTLR, and I would like to add support for recursion.

So far I have implemented function definitions and function calls, including multiple return statements and local variables. To support local variables, I extended the generated parser's partial class FunctionCallContext with a dictionary that holds them. This works for a single call. When a function calls itself (recursively), the parser also creates a new context object for the new call, as I would expect. However, if the recursion goes one level deeper, the third function-call context is the very same object as the second (same hash code, same local variables).

My (updated) grammar:

grammar BatshG;
/*
 * Parser Rules
 */
compileUnit: ( (statement) | functionDef)+;
statement:      print  ';'
            |   println ';'
            |   assignment ';'
            |   loopWhile
            |   branch 
            |   returnStatement ';'
            |   functionCall ';'
;

branch:
          'if' '(' condition=booleanexpression ')' 
                trueBranch=block 
            ('else' falseBranch=block)?;
loopWhile:
          'while' '(' condition=booleanexpression ')' 
                whileBody=block 
;

block: 
                statement
            |   '{' statement* '}';

numericexpression:      
                MINUS onepart=numericexpression               #UnaryMinus
            |   left=numericexpression op=('*'|'/') right=numericexpression  #MultOrDiv
            |   left=numericexpression op=('+'|'-') right=numericexpression  #PlusOrMinus
            |   number=NUMERIC                                      #Number
            |   variableD                                       #NumVariable
;
stringexpression: 
                left=stringexpression PLUSPLUS right=stringexpression #Concat   
            |   string=STRING #String
            |   variableD                                       #StrVariable
            |   numericexpression #NumberToString
;
booleanexpression:
                        left=numericexpression relationalOperator=('<' | '>' | '>=' | '<=' | '==' | '!=' ) right=numericexpression #RelationalOperation
                    |   booleanliteral #Boolean
                    |   numericexpression #NumberToBoolean
;
booleanliteral: trueConst | falseConst ;
trueConst :  'true'          ;
falseConst :  'false'                ;

assignment : varName=IDENTIFIER EQUAL  right=expression;
expression: numericexpression | stringexpression | functionCall | booleanexpression;
println: 'println' '(' argument=expression ')'; 
print: 'print' '(' argument=expression ')';

functionDef: 'function' funcName= IDENTIFIER
                        '(' 
                            (functionParameters=parameterList)?
                        ')'
                            '{' 
                                statements=statementPart?
                            '}' 
;

statementPart:  statement* ;
returnStatement: ('return' returnValue=expression );
parameterList : paramName=IDENTIFIER (',' paramName=IDENTIFIER)*;

functionCall: funcName=IDENTIFIER '(' 
            (functionArguments=argumentList)?
')';
argumentList: expression (',' expression)*;
variableD: varName=IDENTIFIER;
///*
// * Lexer Rules
// */
NUMERIC: (FLOAT | INTEGER);
PLUSPLUS: '++';
MINUS: '-';
IDENTIFIER: [a-zA-Z_][a-zA-Z0-9_]* ;

EQUAL  :            '='              ;

STRING : '"' (~["\r\n] | '""')* '"'          ;

INTEGER: [0-9] [0-9]*;
DIGIT : [0-9]                       ;
FRAC : '.' DIGIT+                   ;
EXP : [eE] [-+]? DIGIT+  ;
FLOAT : DIGIT* FRAC EXP?             ;
WS: [ \n\t\r]+ -> channel(HIDDEN);


The hand-written part of my parser's partial class (not the generated part):

public partial class BatshGParser
{
    //"extensions" for contexts:
    public partial class FunctionCallContext
    {
        private Dictionary<string, object> localVariables = new Dictionary<string, object>();
        private bool isFunctionReturning;
        public FunctionCallContext()
        {
           localVariables = new Dictionary<string, object>();
           isFunctionReturning = false;
        }

        public Dictionary<string, object> LocalVariables { get => localVariables; set => localVariables = value; }
        public bool IsFunctionReturning { get => isFunctionReturning; set => isFunctionReturning = value; }
    }

    public partial class FunctionDefContext
    {
        private List<string> parameterNames;

        public FunctionDefContext()
        {
            parameterNames = new List<string>();
        }
        public List<string> ParameterNames { get => parameterNames; set => parameterNames = value; }
    }
}

And the relevant parts (and maybe a little more) of my visitor:

    public class BatshGVisitor : BatshGBaseVisitor<ResultValue>
    {
        public ResultValue Result { get; set; }
        public StringBuilder OutputForPrint { get; set; }
        private Dictionary<string, object> globalVariables = new Dictionary<string, object>();
        //string = function name
        //object = parameter list
        //object =  return value
        private Dictionary<string, Func<List<object>, object>> globalFunctions = new Dictionary<string, Func<List<object>, object>>();
        private Stack<BatshGParser.FunctionCallContext> actualFunctions = new Stack<BatshGParser.FunctionCallContext>();

        public override ResultValue VisitCompileUnit([NotNull] BatshGParser.CompileUnitContext context)
        {
            OutputForPrint = new StringBuilder("");

            isSearchingForFunctionDefinitions = true;
            var resultvalue = VisitChildren(context);
            isSearchingForFunctionDefinitions = false;
            resultvalue = VisitChildren(context);
            Result = new ResultValue() { ExpType = "string", ExpValue = resultvalue.ExpValue ?? null };

            return Result;
        }
        public override ResultValue VisitChildren([NotNull] IRuleNode node)
        {
            if (this.isSearchingForFunctionDefinitions)
            {
                for (int i = 0; i < node.ChildCount; i++)
                {
                    if (node.GetChild(i) is BatshGParser.FunctionDefContext)
                    {
                        Visit(node.GetChild(i));
                    }
                }
            }
            return base.VisitChildren(node);
        }
        protected override bool ShouldVisitNextChild([NotNull] IRuleNode node, ResultValue currentResult)
        {
            if (isSearchingForFunctionDefinitions)
            {
                if (node is BatshGParser.FunctionDefContext)
                {
                    return true;
                }
                else
                    return false;
            }
            else
            {
                if (node is BatshGParser.FunctionDefContext)
                {
                    return false;
                }
                else
                    return base.ShouldVisitNextChild(node, currentResult);
            }
        }

        public override ResultValue VisitFunctionDef([NotNull] BatshGParser.FunctionDefContext context)
        {

            string functionName = null;
            functionName = context.funcName.Text;
            if (context.functionParameters != null)
            {
                List<string> plist = CollectParamNames(context.functionParameters);
                context.ParameterNames = plist;
            }
            if (isSearchingForFunctionDefinitions)
                globalFunctions.Add(functionName,
                (
                delegate(List<object> args)
                    {
                        var currentMethod = (args[0] as BatshGParser.FunctionCallContext);
                        this.actualFunctions.Push(currentMethod);
                        //args[0] is the context
                        for (int i = 1; i < args.Count; i++)
                        {

                            currentMethod.LocalVariables.Add(context.ParameterNames[i - 1],
                                (args[i] as ResultValue).ExpValue
                                );
                        }

                    ResultValue retval = null;
                        retval = this.VisitStatementPart(context.statements);

                        this.actualFunctions.Peek().IsFunctionReturning = false;
                        actualFunctions.Pop();
                        return retval;
                    }
                 )
            );
            return new ResultValue()
            {

            };
        }       

        public override ResultValue VisitStatementPart([NotNull] BatshGParser.StatementPartContext context)
        {
            if (!this.actualFunctions.Peek().IsFunctionReturning)
            {
                return VisitChildren(context);
            }
            else
            {
                return null;
            }
        }

        public override ResultValue VisitReturnStatement([NotNull] BatshGParser.ReturnStatementContext context)
        {
            this.actualFunctions.Peek().IsFunctionReturning = true;
            ResultValue retval = null;
            if (context.returnValue != null)
            {
                retval = Visit(context.returnValue);
            }

            return retval;
        }

        public override ResultValue VisitArgumentList([NotNull] BatshGParser.ArgumentListContext context)
        {
            List<ResultValue> argumentList = new List<ResultValue>();
            foreach (var item in context.children)
            {
                var tt = item.GetText();
                if (item.GetText() != ",")
                {
                    ResultValue rv = Visit(item);
                    argumentList.Add(rv);
                }
            }
            return
                new ResultValue()
                {
                    ExpType = "list",
                    ExpValue = argumentList ?? null
                };
        }

        public override ResultValue VisitFunctionCall([NotNull] BatshGParser.FunctionCallContext context)
        {
            string functionName = context.funcName.Text;
            int hashcodeOfContext = context.GetHashCode();
            object functRetVal = null;
            List<object> argumentList = new List<object>()
            {
                context
                //here come the actual parameters later
            };

            ResultValue argObjects = null;
            if (context.functionArguments != null)
            {
                argObjects = VisitArgumentList(context.functionArguments);
            }

            if (argObjects != null )
            {
                if (argObjects.ExpValue is List<ResultValue>)
                {
                    var argresults = (argObjects.ExpValue as List<ResultValue>) ?? null;
                    foreach (var arg in argresults)
                    {
                        argumentList.Add(arg);
                    }
                }
            }

            if (globalFunctions.ContainsKey(functionName))
            {
                {
                    functRetVal = globalFunctions[functionName]( argumentList );
                }
            }

            return new ResultValue()
            {
                ExpType = ((ResultValue)functRetVal).ExpType,
                ExpValue = ((ResultValue)functRetVal).ExpValue
            };
        }

        public override ResultValue VisitVariableD([NotNull] BatshGParser.VariableDContext context)
        {

            object variable;
            string variableName = context.GetChild(0).ToString();
            string typename = "";

            Dictionary<string, object> variables = null;
            if (actualFunctions.Count > 0)
            {
                Dictionary<string, object> localVariables = 
                    actualFunctions.Peek().LocalVariables;
                if (localVariables.ContainsKey(variableName))
                {
                    variables = localVariables;
                }
            }
            else
            {
                variables = globalVariables;
            }

            if (variables.ContainsKey(variableName))
            {
                variable = variables[variableName];

                typename = charpTypesToBatshTypes[variable.GetType()];
            }
            else
            {

                Type parentContextType = contextTypes[context.parent.GetType()];
                typename = charpTypesToBatshTypes[parentContextType];
                variable = new object();

                if (typename.Equals("string"))
                {
                    variable = string.Empty;
                }
                else
                {
                    variable = 0d;
                }
            }

            return new ResultValue()
            {
                ExpType = typename,
                ExpValue = variable
            };
        }           
        public override ResultValue VisitAssignment([NotNull] BatshGParser.AssignmentContext context)
        {
            string varname = context.varName.Text;
            ResultValue varAsResultValue = Visit(context.right);
            Dictionary<string, object> localVariables = null;

            if (this.actualFunctions.Count > 0)
            {
                localVariables = 
                    actualFunctions.Peek().LocalVariables;
                if (localVariables.ContainsKey(varname))
                {
                    localVariables[varname] = varAsResultValue.ExpValue;
                }
                else
                if (globalVariables.ContainsKey(varname))
                {
                    globalVariables[varname] = varAsResultValue.ExpValue;
                }
                else
                {
                    localVariables.Add(varname, varAsResultValue.ExpValue);
                }
            }
            else
            {
                if (globalVariables.ContainsKey(varname))
                {
                    globalVariables[varname] = varAsResultValue.ExpValue;
                }
                else
                {
                    globalVariables.Add(varname, varAsResultValue.ExpValue);
                }
            }
            return varAsResultValue;
        }
    }

What could cause the problem? Thank you!


Answer 1:


Why does parser generated by ANTLR reuse context objects?

It doesn't. Each function call in your source code corresponds to exactly one FunctionCallContext object, and those objects are unique. They have to be, even for two entirely identical function calls, because they also contain metadata, such as where in the source the function call appears - and that is obviously going to differ between calls even if everything else is the same.

To illustrate this, consider the following source code:

function f(x) {
  return f(x);
}
print(f(x));

This will create a tree containing exactly two FunctionCallContext objects - one for line 2 and one for line 4. They will both be distinct - they'll both have child nodes referring to the function name f and the argument x, but they'll have different location information and a different hash code - as will the child nodes. Nothing is being reused here.

What could cause the problem?

You see the same node multiple times simply because you visit the same part of the tree multiple times. That's a perfectly normal thing to do for your use case, but here it causes a problem because you stored mutable data in the object, assuming that you'd get a fresh FunctionCallContext object each time a function call happens at run time, rather than one for each place a function call appears in the source code.

That's not how parse trees work (they represent the structure of the source code, not the sequence of calls that might happen at run time), so you can't use FunctionCallContext objects to store information about a specific run-time function call. In general, I'd consider it a bad idea to put mutable state into context objects.
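
The effect can be reproduced without ANTLR. The following is only a minimal sketch under stated assumptions: CallSite is a hypothetical stand-in for FunctionCallContext, and VisitCall imitates a visitor that re-enters the function body on every call. It suggests why the second call appears to get a fresh object while the third does not: the first call comes from the call site outside the function, and every later call comes from the single call site inside the function body.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a parse-tree node such as FunctionCallContext.
// One instance exists per call site in the source text.
public sealed class CallSite
{
    public string Label { get; }
    public CallSite(string label) { Label = label; }
}

public static class SharedNodeDemo
{
    // Call site inside the function body: "return f(x);"
    static readonly CallSite Inner = new CallSite("inner");
    // Call site at top level: "print(f(x));"
    static readonly CallSite Outer = new CallSite("outer");

    // Imitates visiting a call: record the node, then visit the body,
    // which contains exactly one call site (Inner), until maxDepth is reached.
    static void VisitCall(CallSite site, int depth, int maxDepth, List<CallSite> seen)
    {
        seen.Add(site);
        if (depth < maxDepth)
            VisitCall(Inner, depth + 1, maxDepth, seen);
    }

    // Returns the node object seen at each recursion depth, outermost first.
    public static List<CallSite> Run(int maxDepth)
    {
        var seen = new List<CallSite>();
        VisitCall(Outer, 1, maxDepth, seen);
        return seen;
    }

    public static void Main()
    {
        List<CallSite> seen = Run(3);
        Console.WriteLine(ReferenceEquals(seen[0], seen[1])); // False: outer vs. inner call site
        Console.WriteLine(ReferenceEquals(seen[1], seen[2])); // True: the same inner call site again
    }
}
```

Any dictionary stored on the Inner node would therefore be shared by every recursion level from the second one onward.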

Instead you should put your mutable state into your visitor object. For your specific problem that means having a call stack containing the local variables of each run-time function call. Each time a function starts execution, you can push a frame onto the stack and each time a function exits, you can pop it. That way the top of the stack will always contain the local variables of the function currently being executed.
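
The call stack described above could look roughly like this. It is a sketch rather than a drop-in replacement for the visitor in the question: Frame, CallStack, and Invoke are hypothetical names, and the body of the interpreted function is represented by a plain delegate instead of a visit to the statementPart subtree.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-call record; not part of ANTLR.
public sealed class Frame
{
    public Dictionary<string, object> Locals { get; } = new Dictionary<string, object>();
    public bool IsReturning { get; set; }
}

// Run-time state that would live in the visitor rather than in the parse tree.
public sealed class CallStack
{
    private readonly Stack<Frame> frames = new Stack<Frame>();

    public int Depth => frames.Count;
    public Frame Current => frames.Peek();

    // Pushes a fresh frame, binds arguments to parameter names, runs the body,
    // and pops the frame even if the body throws.
    public object Invoke(IReadOnlyList<string> parameterNames,
                         IReadOnlyList<object> arguments,
                         Func<Frame, object> body)
    {
        var frame = new Frame();
        for (int i = 0; i < parameterNames.Count; i++)
            frame.Locals[parameterNames[i]] = arguments[i];

        frames.Push(frame);
        try { return body(frame); }
        finally { frames.Pop(); }
    }
}

public static class CallStackDemo
{
    // Equivalent of: function sum(n) { if (n == 0) return 0; return n + sum(n - 1); }
    public static double Sum(CallStack stack, double n)
    {
        return (double)stack.Invoke(new[] { "n" }, new object[] { n }, frame =>
        {
            double local = (double)frame.Locals["n"];
            if (local == 0) return 0d;
            double rest = Sum(stack, local - 1);
            // The recursive call used its own frame, so this frame's "n" is unchanged.
            return (double)frame.Locals["n"] + rest;
        });
    }

    public static void Main()
    {
        var stack = new CallStack();
        Console.WriteLine(Sum(stack, 3)); // 6
        Console.WriteLine(stack.Depth);   // 0
    }
}
```

In the visitor from the question, VisitFunctionCall would push the frame before visiting the function body, and VisitVariableD and VisitAssignment would read and write the Locals of the current frame instead of a dictionary on the context object.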


PS: This is unrelated to your problem, but under the usual rules of precedence in arithmetic expressions, + has the same precedence as - and * has the same precedence as /. In your grammar, / has higher precedence than * and - has higher precedence than +. This means that, for example, 9 * 5 / 3 is parsed as 9 * (5 / 3) and evaluates to 9 when it should be 15 (assuming the usual rules for integer arithmetic).

To fix this, + and - (as well as * and /) should be part of the same alternative, so they get the same precedence:

| left=numericexpression op=('*'|'/') right=numericexpression #MultOrDiv
| left=numericexpression op=('+'|'-') right=numericexpression #PlusOrMinus
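
As a quick arithmetic check of the two groupings, the following sketch evaluates both with C# integer arithmetic. PrecedenceDemo and its method names are made up for illustration. The explicit parentheses mimic the shape of the parse tree in each case.

```csharp
using System;

public static class PrecedenceDemo
{
    // Same precedence, left-associative: (9 * 5) / 3
    public static int SamePrecedence() => (9 * 5) / 3;

    // Division binding tighter than multiplication: 9 * (5 / 3)
    public static int DivisionBindsTighter() => 9 * (5 / 3);

    public static void Main()
    {
        Console.WriteLine(SamePrecedence());       // 15
        Console.WriteLine(DivisionBindsTighter()); // 9, because 5 / 3 truncates to 1
    }
}
```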


Source: https://stackoverflow.com/questions/53724871/why-does-parser-generated-by-antlr-reuse-context-objects
