Question
I'm currently using scikit-learn's GaussianNB package.
I've noticed that I can get classification results in several different ways. One of them is the predict_log_proba method.
Why would I choose to use predict_log_proba versus predict_proba versus predict?
Answer 1:
- predict just gives you the predicted class for every example
- predict_proba gives you the probability of every class; predict simply takes the class with the maximal probability
- predict_log_proba gives you the logarithm of those probabilities, which is often handier because probabilities can become very, very small
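The relationship between the three methods can be checked directly. A minimal sketch with a made-up toy dataset (the data and variable names are illustrative, not from the question):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy 1-D dataset: two well-separated clusters, classes 0 and 1
X = np.array([[1.0], [1.2], [0.9], [5.0], [5.2], [4.8]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = GaussianNB().fit(X, y)

X_new = np.array([[1.1], [5.1]])
labels = clf.predict(X_new)               # hard class labels
proba = clf.predict_proba(X_new)          # per-class probabilities, rows sum to 1
log_proba = clf.predict_log_proba(X_new)  # elementwise log of those probabilities
```

Here `exp(log_proba)` recovers `proba`, and `predict` returns `classes_[argmax(proba, axis=1)]`.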
Answer 2:
When computing with probabilities, it's quite common to do so in log-space instead of in linear space because probabilities often need to be multiplied, causing them to become very small and subject to rounding errors. Also, some quantities like KL divergence are either defined or easily computed in terms of log-probabilities (note that log(P/Q) = log(P) - log(Q)).
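A quick pure-Python sketch of why multiplying in linear space fails: the product of 100 small probabilities underflows a double to 0.0, while the sum of their logs stays perfectly representable (the numbers here are made up for illustration):

```python
import math

# 100 independent events, each with probability 1e-4
log_probs = [math.log(1e-4)] * 100

# Linear space: the running product underflows to exactly 0.0
linear = 1.0
for lp in log_probs:
    linear *= math.exp(lp)

# Log space: just a sum, about -921.0, no underflow
log_total = sum(log_probs)
```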
Finally, Naive Bayes classifiers typically work in log-space themselves for reasons of stability and speed, so first computing exp(log P) only to get log P back later is wasteful.
Source: https://stackoverflow.com/questions/20335944/why-use-log-probability-estimates-in-gaussiannb-scikit-learn