Why not use mean squared error for classification problems

半城伤御伤魂 submitted on 2019-12-11 17:45:49

Question


I am trying to implement a simple binary classification problem using an LSTM RNN and am still unable to figure out the correct loss function for the network. The issue is that when I use binary cross-entropy as the loss function, the loss values for training and testing are relatively high compared to using mean squared error (MSE).

Upon research, I came across justifications that binary cross-entropy should be used for classification problems and MSE for regression problems. However, in my case, I am getting better accuracy and lower loss values with MSE for binary classification.

I am not sure how to justify these results. I am completely new to AI and ML techniques.


Answer 1:


I would like to share my understanding of MSE and binary cross-entropy.

In classification, we take the argmax() of the predicted probabilities for each training instance.

Now consider a binary classifier where the model predicts the probabilities (0.49, 0.51). In this case the model returns "1" as its prediction.

Assume the actual label is also "1".

In that case, MSE computed on the predicted label returns 0 as the loss, whereas binary cross-entropy, which is computed on the probabilities, returns a tangible value (-ln(0.51) ≈ 0.67). If the trained model predicts similar probabilities for every data sample, binary cross-entropy accumulates into a large total loss, whereas MSE on the predicted labels stays at 0.

According to MSE, it is a perfect model, but in reality it is not that good a model, since it is barely more confident than a coin flip. That is why we should not use MSE for classification.
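To make the numbers in this example concrete, here is a minimal sketch in plain Python. The helper names `mse` and `binary_cross_entropy` are illustrative and not taken from any particular library, and encoding the label "1" as the one-hot target [0, 1] is an assumption made for the probability-based MSE.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(label, p_one):
    """Negative log-likelihood of a 0/1 label, given P(class 1) = p_one."""
    return -(label * math.log(p_one) + (1 - label) * math.log(1 - p_one))

probs = (0.49, 0.51)   # predicted P(class 0), P(class 1)
label = 1              # actual class

# argmax over the probabilities gives the hard prediction
hard_prediction = max(range(len(probs)), key=lambda i: probs[i])

# MSE on the hard label: the prediction matches, so the loss is 0
mse_on_label = mse([label], [hard_prediction])

# MSE on the probabilities against the one-hot target [0, 1] (assumed encoding)
mse_on_probs = mse([0, 1], probs)

# Binary cross-entropy on the probability assigned to class 1
bce = binary_cross_entropy(label, probs[1])

print(f"hard prediction      : {hard_prediction}")
print(f"MSE on hard label    : {mse_on_label:.4f}")
print(f"MSE on probabilities : {mse_on_probs:.4f}")
print(f"binary cross-entropy : {bce:.4f}")
```

The zero loss only appears when MSE is taken after argmax. On the raw probabilities MSE is nonzero, but still numerically smaller than the cross-entropy for the same prediction, so the raw values of the two losses are not directly comparable.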



Answer 2:


I would like to show it using an example. Assume a 6-class classification problem.

Assume true probabilities = [1, 0, 0, 0, 0, 0]

Case 1: Predicted probabilities = [0.2, 0.16, 0.16, 0.16, 0.16, 0.16]

Case 2: Predicted probabilities = [0.4, 0.5, 0.1, 0, 0, 0]

The MSE in Case 1 and Case 2 is 0.128 and 0.1033 respectively.

Although Case 1 correctly predicts class 1 for the instance, its loss is higher than the loss in Case 2, which wrongly predicts class 2.
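The two MSE values can be reproduced with a short sketch in plain Python. The helpers `mse` and `argmax` are illustrative names written for this example, and classes are numbered from 1 to match the text.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences over all classes."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def argmax(values):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

y_true = [1, 0, 0, 0, 0, 0]
case_1 = [0.2, 0.16, 0.16, 0.16, 0.16, 0.16]
case_2 = [0.4, 0.5, 0.1, 0, 0, 0]

mse_1 = mse(y_true, case_1)
mse_2 = mse(y_true, case_2)

# Convert 0-based indices to the 1-based class numbers used in the text
pred_1 = argmax(case_1) + 1
pred_2 = argmax(case_2) + 1

print(f"Case 1: predicted class {pred_1}, MSE = {mse_1:.4f}")
print(f"Case 2: predicted class {pred_2}, MSE = {mse_2:.4f}")
```

Case 1 picks the correct class yet receives the larger MSE, because MSE also counts the error on the five non-target classes, and a single confident wrong entry in Case 2 contributes less than the weak score on the true class in Case 1.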



Source: https://stackoverflow.com/questions/56013688/why-not-to-use-mean-square-error-for-classification-problem
