FMTWIDTH error while using JPMML to evaluate a SAS produced PMML file

Posted by 不羁的心 on 2019-12-08 03:06:08

Question


I have a PMML generated from SAS Miner that I can't get properly evaluated using JPMML 1.1.4. JPMML 1.1.4 says it supports PMML 4.2 and the PMML says it is PMML version 4.2.

Is the FMTWIDTH in the below function "SAS-EM-String-Normalize" proper PMML syntax?

Any ideas why I can't evaluate this function using JPMML?

The function is defined in my TransformationDictionary as follows:

<TransformationDictionary>
    <DefineFunction name="SAS-EM-String-Normalize" optype="categorical" dataType="string">
        <ParameterField name="FMTWIDTH" optype="continuous"/>
        <ParameterField name="AnyCInput" optype="categorical"/>
        <Apply function="trimBlanks">
          <Apply function="uppercase">
            <Apply function="substring">
              <FieldRef field="AnyCInput"/>
              <Constant>1</Constant>
              <Constant>FMTWIDTH</Constant>
            </Apply>
          </Apply>
        </Apply>   
    </DefineFunction>
</TransformationDictionary>

And I get the following exception,

Exception in thread "main" org.jpmml.evaluator.TypeCheckException: Expected INTEGER, but got STRING (FMTWIDTH)
    at org.jpmml.evaluator.FieldValue.asInteger(FieldValue.java:125)
    at org.jpmml.evaluator.FunctionRegistry$36.evaluate(FunctionRegistry.java:463)
    at org.jpmml.evaluator.FunctionUtil.evaluate(FunctionUtil.java:38)
    at org.jpmml.evaluator.ExpressionUtil.evaluateApply(ExpressionUtil.java:203)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:91)
    at org.jpmml.evaluator.FunctionUtil.evaluate(FunctionUtil.java:76)
    at org.jpmml.evaluator.FunctionUtil.evaluate(FunctionUtil.java:43)
    at org.jpmml.evaluator.ExpressionUtil.evaluateApply(ExpressionUtil.java:203)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:91)
    at org.jpmml.evaluator.ExpressionUtil.evaluateApply(ExpressionUtil.java:188)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:91)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:58)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:45)
    at org.jpmml.evaluator.ExpressionUtil.evaluateMapValues(ExpressionUtil.java:169)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:87)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:58)
    at org.jpmml.evaluator.ExpressionUtil.evaluate(ExpressionUtil.java:45)
    at org.jpmml.evaluator.RegressionModelEvaluator.evaluateRegressionTable(RegressionModelEvaluator.java:150)
    at org.jpmml.evaluator.RegressionModelEvaluator.evaluateClassification(RegressionModelEvaluator.java:107)
    at org.jpmml.evaluator.RegressionModelEvaluator.evaluate(RegressionModelEvaluator.java:57)
    at org.jpmml.evaluator.ModelEvaluator.evaluate(ModelEvaluator.java:65)
    at ValidPMMLTesterRandomScores.randomEvaluation(ValidPMMLTesterRandomScores.java:116)
    at ValidPMMLTesterRandomScores.printModelInformation(ValidPMMLTesterRandomScores.java:94)
    at ValidPMMLTesterRandomScores.readModelFromFile(ValidPMMLTesterRandomScores.java:142)
    at ValidPMMLTesterRandomScores.main(ValidPMMLTesterRandomScores.java:160)


Answer 1:


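According to the formal definition of the PMML built-in function "substring", it requires a string argument and two integer arguments. The SAS EM generated PMML code instead invokes this function with a string argument, an integer argument, and another string argument: substring($AnyCInput, 1, "FMTWIDTH"). The Constant element carries the literal text "FMTWIDTH", not the value of the FMTWIDTH parameter, which is why the evaluator reports a STRING where it expects an INTEGER.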

This PMML fragment can be fixed by accessing the value of the "FMTWIDTH" parameter using the FieldRef element:

<Apply function="substring">
  <FieldRef field="AnyCInput"/>
  <Constant>1</Constant>
  <FieldRef field="FMTWIDTH"/>
</Apply>

In conclusion, JPMML is correct and SAS EM is wrong.
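
To make the intended behaviour of the corrected function concrete, here is a minimal plain-Java sketch of what "SAS-EM-String-Normalize" computes once FMTWIDTH is read as a parameter. It does not use JPMML; the class and method names are made up for illustration, and clamping the length at the end of the input is an assumption of this sketch rather than something stated above.

```java
import java.util.Locale;

public class SasEmStringNormalizeSketch {

    // Emulates PMML substring(input, startPos, length); startPos is 1-based.
    // Assumption: a length that runs past the end of the input is clamped.
    static String pmmlSubstring(String input, int startPos, int length) {
        if (startPos < 1 || length < 0) {
            throw new IllegalArgumentException("startPos must be >= 1 and length >= 0");
        }
        int begin = Math.min(startPos - 1, input.length());
        int end = Math.min(begin + length, input.length());
        return input.substring(begin, end);
    }

    // trimBlanks(uppercase(substring(AnyCInput, 1, FMTWIDTH)))
    // String.trim() strips leading and trailing whitespace, which approximates trimBlanks.
    static String normalize(int fmtWidth, String anyCInput) {
        return pmmlSubstring(anyCInput, 1, fmtWidth).toUpperCase(Locale.ROOT).trim();
    }

    public static void main(String[] args) {
        // The first 5 characters are "  abc", which becomes "ABC" after uppercasing and trimming.
        System.out.println("[" + normalize(5, "  abcdefg") + "]"); // prints [ABC]
    }
}
```

The point of the sketch is that the third argument must be a number: passing the text "FMTWIDTH" in its place has no sensible meaning, which matches the TypeCheckException in the question.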




Answer 2:


Invalid PMML documents can be corrected on the fly by rearranging the PMML class model object. The Visitor API of the JPMML-Model library is designed exactly for this purpose:

PMML pmml = loadSasEmPMML();

Visitor invalidSubstringCorrector = new AbstractVisitor(){

    @Override
    public VisitorAction visit(Apply apply){
        if(isInvalidSubstring(apply)){
            List<Expression> expressions = apply.getExpressions();

            expressions.set(2, new FieldRef(new FieldName("FMTWIDTH")));
        }
        return super.visit(apply);
    }

    private boolean isInvalidSubstring(Apply apply){
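        // Require exactly three arguments so that get(2) below cannot go out of bounds
        if(("substring").equals(apply.getFunction()) && apply.getExpressions().size() == 3){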
            List<Expression> expressions = apply.getExpressions();

            Expression lengthArgument = expressions.get(2);
            if(lengthArgument instanceof Constant){
                Constant constant = (Constant)lengthArgument;
                return ("FMTWIDTH").equals(constant.getValue());
            }
        }
        return false;
    }
};

invalidSubstringCorrector.applyTo(pmml);

Currently, the method isInvalidSubstring(Apply) identifies problematic Apply elements by checking only if the third expression element is a String constant "FMTWIDTH". If one needs to be extra sure, then perhaps it would be a good idea to add proper assertions about the first and second expression element as well.
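
As a rough illustration of such a stricter check, the sketch below tests all three arguments before flagging an element. To stay self-contained it uses small hypothetical stand-in classes (Expr, Ref, Const, Call) in place of the JPMML-Model classes Expression, FieldRef, Constant and Apply; it shows the shape of the logic only and is not the library API.

```java
import java.util.Arrays;
import java.util.List;

public class StrictSubstringCheckSketch {

    // Hypothetical stand-ins for PMML expression elements; NOT the JPMML-Model classes.
    interface Expr {}

    static final class Ref implements Expr {      // plays the role of FieldRef
        final String field;
        Ref(String field) { this.field = field; }
    }

    static final class Const implements Expr {    // plays the role of Constant
        final String value;
        Const(String value) { this.value = value; }
    }

    static final class Call implements Expr {     // plays the role of Apply
        final String function;
        final List<Expr> args;
        Call(String function, Expr... args) {
            this.function = function;
            this.args = Arrays.asList(args);
        }
    }

    // True only for the exact shape substring(<field reference>, 1, "FMTWIDTH").
    static boolean isInvalidSubstring(Call call) {
        if (!"substring".equals(call.function) || call.args.size() != 3) {
            return false;
        }
        Expr input = call.args.get(0);
        Expr start = call.args.get(1);
        Expr length = call.args.get(2);
        return (input instanceof Ref)
            && (start instanceof Const) && "1".equals(((Const) start).value)
            && (length instanceof Const) && "FMTWIDTH".equals(((Const) length).value);
    }

    public static void main(String[] args) {
        Call broken = new Call("substring",
            new Ref("AnyCInput"), new Const("1"), new Const("FMTWIDTH"));
        Call fixed = new Call("substring",
            new Ref("AnyCInput"), new Const("1"), new Ref("FMTWIDTH"));
        System.out.println(isInvalidSubstring(broken)); // true
        System.out.println(isInvalidSubstring(fixed));  // false
    }
}
```

Checking the argument count first also protects against an index error on a malformed Apply element with fewer than three children.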



Source: https://stackoverflow.com/questions/33003359/fmtwidth-error-while-using-jpmml-to-evaluate-a-sas-produced-pmml-file
