Spark 2.1.0, ML RandomForest: java.lang.UnsupportedOperationException: empty.maxBy

Submitted by 自古美人都是妖i on 2019-12-11 09:05:14

Question


I am trying to fit an ML CrossValidator on a DataFrame with the following schema:

root
 |-- userID: string (nullable = true)
 |-- features: vector (nullable = true)
 |-- label: double (nullable = true)

I am getting a java.lang.UnsupportedOperationException: empty.maxBy when I fit the CrossValidator.
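For context, the wording of the message matches what the Scala collections library reports when maxBy is called on an empty collection. The sketch below only reproduces that message in plain Scala. EmptyMaxByDemo and maxByFailure are hypothetical names for illustration, and it does not show which Spark call is involved.

```scala
import scala.util.Try

object EmptyMaxByDemo {
  // Returns the exception message raised by maxBy on xs, or None if maxBy succeeds.
  def maxByFailure(xs: Seq[Int]): Option[String] =
    Try(xs.maxBy(identity)).failed.toOption.map(_.getMessage)

  def main(args: Array[String]): Unit = {
    println(maxByFailure(Seq.empty)) // maxBy has nothing to compare, so it throws
    println(maxByFailure(Seq(3, 7))) // a non-empty sequence does not throw
  }
}
```

So the exception indicates that some collection was empty at the point where a maximum was being taken. It does not by itself say which collection that was.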

I have read this bug report, which says that this exception happens when there are no features:

In the case of empty features we fail with a better error message stating: "DecisionTree requires number of features > 0, but was given an empty features vector", instead of the cryptic error message: java.lang.UnsupportedOperationException: empty.max

In my case, I do have thousands of features, so I am sure that the features DataFrame is not empty.

What could be another reason for this exception?

I am running the cluster on EMR. Here is the code, in case it helps (the DataFrame is named featuresDF, and before fitting the CrossValidator I verified that it contains no empty feature vectors):

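import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.RandomForestClassifier
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Random forest reading the "features" vector column and the "label" column
val rf = new RandomForestClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")

val pipeline = new Pipeline().setStages(Array(rf))

// 2 x 2 grid: number of trees and maximum tree depth
val paramGrid = new ParamGridBuilder()
  .addGrid(rf.numTrees, Array(500, 1000))
  .addGrid(rf.maxDepth, Array(15, 25))
  .build()

// Score each parameter combination by area under the precision-recall curve
val evaluator = new BinaryClassificationEvaluator()
  .setLabelCol("label")
  .setMetricName("areaUnderPR")

val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(evaluator)
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(3)

// The exception is thrown during this call
val model = cv.fit(featuresDF)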
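
The check mentioned above, that no row has an empty feature vector, could be sketched roughly as follows. This is only an illustration under assumptions: FeatureChecks, countEmptyFeatures and labelCounts are hypothetical helper names, not part of Spark, and the column names are taken from the schema shown earlier.

```scala
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, udf}

object FeatureChecks {
  // Length of a feature vector; a null vector is treated as length 0.
  private val vectorSize = udf((v: Vector) => if (v == null) 0 else v.size)

  // Number of rows whose feature vector is null or has no entries.
  def countEmptyFeatures(df: DataFrame, featuresCol: String = "features"): Long =
    df.filter(vectorSize(col(featuresCol)) === 0).count()

  // Row count per label value, to see how the classes are distributed.
  def labelCounts(df: DataFrame, labelCol: String = "label"): DataFrame =
    df.groupBy(col(labelCol)).count()
}
```

Calling FeatureChecks.countEmptyFeatures(featuresDF) should return 0 if the verification holds, and FeatureChecks.labelCounts(featuresDF).show() prints the label distribution. With setNumFolds(3), each model is fit on a subset of the rows, so the same checks may be worth repeating on a random split of featuresDF.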

Source: https://stackoverflow.com/questions/44024076/spark-2-1-0-ml-randomforest-java-lang-unsupportedoperationexception-empty-max
