How to get the probability per instance in classification models in spark.mllib

清酒与你 2020-12-12 03:03

I'm using spark.mllib.classification.{LogisticRegressionModel, LogisticRegressionWithSGD} and spark.mllib.tree.RandomForest for classification. Using these packages I produ

1 answer
  • 2020-12-12 04:03

    Unfortunately, with MLlib you can't get the per-instance probabilities for classification models up to and including version 1.4.1.

    There are JIRA issues (SPARK-4362 and SPARK-6885) on this exact topic, which are IN PROGRESS as I write this answer. Nevertheless, they seem to have been on hold since November 2014:

    There is currently no way to get the posterior probability of a prediction with the Naive Bayes model during prediction. This should be made available along with the label.

    And here is a note from @sean-owen on the mailing list on a similar topic regarding the Naive Bayes classification algorithm:

    This was recently discussed on this mailing list. You can't get the probabilities out directly now, but you can hack a bit to get the internal data structures of NaiveBayesModel and compute it from there.

    Reference: source.
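
The "hack" mentioned in the quote can be illustrated with a minimal sketch in plain Python. It assumes a multinomial Naive Bayes model whose parameters are stored the way spark.mllib's NaiveBayesModel exposes them, as far as I understand it: `pi` holding the log class priors and `theta` holding the log conditional probabilities per class and feature. The function name `naive_bayes_posteriors` and the parameter values are hypothetical, made up for illustration only.

```python
import math


def naive_bayes_posteriors(pi, theta, features):
    """Posterior P(class | features) for a multinomial Naive Bayes model.

    pi[c]       : log prior of class c (assumed layout)
    theta[c][j] : log conditional probability of feature j given class c
    features[j] : value (e.g. count) of feature j for one instance
    """
    # Unnormalised log-posterior per class: log P(c) + sum_j x_j * log P(j | c)
    log_scores = [
        log_prior + sum(t * x for t, x in zip(log_cond, features))
        for log_prior, log_cond in zip(pi, theta)
    ]
    # Subtract the maximum before exponentiating (log-sum-exp trick)
    # so that very negative log scores do not underflow to zero.
    max_score = max(log_scores)
    exp_scores = [math.exp(s - max_score) for s in log_scores]
    total = sum(exp_scores)
    return [e / total for e in exp_scores]


# Hypothetical parameters for a 2-class, 3-feature model.
pi = [math.log(0.6), math.log(0.4)]
theta = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.3), math.log(0.6)],
]
posteriors = naive_bayes_posteriors(pi, theta, [2.0, 0.0, 1.0])
```

With a trained model, one would presumably pass the model's own `pi` and `theta` together with the feature values of each instance instead of the made-up numbers above; treat that mapping as an assumption to verify against the Spark version in use.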

    MAJOR EDIT: This issue has been resolved with Spark 1.5.0. Please refer to the JIRA issue for more details.
