How to get the probability per instance in classification models in spark.mllib

Submitted by 你说的曾经没有我的故事 on 2019-11-28 14:30:57

Unfortunately, with MLlib you can't get the probabilities per instance for classification models up to and including version 1.4.1.

There are JIRA issues (SPARK-4362 and SPARK-6885) concerning this exact topic, which are IN PROGRESS as I write this answer. Nevertheless, the work seems to have been on hold since November 2014. The issue description reads:

There is currently no way to get the posterior probability of a prediction with the Naive Bayes model during prediction. This should be made available along with the label.

And here is a note from @sean-owen on the mailing list on a similar topic regarding the Naive Bayes classification algorithm:

This was recently discussed on this mailing list. You can't get the probabilities out directly now, but you can hack a bit to get the internal data structures of NaiveBayesModel and compute it from there.

Reference : source.
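
To make that hack concrete, here is a minimal sketch of the computation, not the MLlib implementation itself. It assumes a multinomial Naive Bayes model whose pi holds the log class priors and whose theta holds the log conditional feature probabilities, which is how NaiveBayesModel stores them. The function name naive_bayes_posteriors and the two-class model below are hypothetical and only there for illustration; with a trained model you would pass in its own pi and theta instead.

```python
import math


def naive_bayes_posteriors(pi, theta, features):
    """Return P(class | features) for a multinomial Naive Bayes model.

    pi[c]       -- log prior of class c
    theta[c][j] -- log conditional probability of feature j given class c
    features[j] -- count (or weight) of feature j in the instance
    """
    # Unnormalized log posterior: log P(c) + sum_j x_j * log P(j | c)
    log_scores = [
        prior + sum(w * x for w, x in zip(row, features))
        for prior, row in zip(pi, theta)
    ]
    # Normalize with log-sum-exp so small probabilities do not underflow
    top = max(log_scores)
    exp_scores = [math.exp(s - top) for s in log_scores]
    total = sum(exp_scores)
    return [e / total for e in exp_scores]


# Hypothetical two-class, two-feature model (values made up for illustration)
pi = [math.log(0.5), math.log(0.5)]
theta = [
    [math.log(0.8), math.log(0.2)],
    [math.log(0.3), math.log(0.7)],
]
probs = naive_bayes_posteriors(pi, theta, [2.0, 1.0])
```

The returned list is ordered like the model's labels, so probs[i] is the posterior for the i-th label.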

MAJOR EDIT: This issue has been resolved with Spark 1.5.0. Please refer to the JIRA issue for more details.
