Why not use the max value of the output tensor instead of the Softmax function?

Submitted by 一笑奈何 on 2020-01-15 11:04:33

Question


I built a CNN model for one-class classification of images. The output tensor is a list of 65 elements. I feed this tensor into the Softmax function and get the classification result. It seems to me that the max value in this output tensor already identifies the predicted class, so why not classify that way? Is it only because the Softmax function is easy to differentiate?


Answer 1:


Softmax is used for multi-class classification. In multi-class classification, the model is expected to assign the input to a single class with high probability. Because the softmax outputs sum to 1, predicting one class with high probability forces the probabilities of the other classes to be low.

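As you stated, one of the reasons to use softmax rather than the max function is that softmax is differentiable over the real numbers, while the hard max is not useful for gradient-based training: its output is piecewise constant, so its gradient is zero almost everywhere and undefined at ties.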
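To make the differentiability point concrete, here is the standard definition of softmax and its partial derivatives. The notation is an assumption added for illustration and does not come from the original answer: z is the vector of K logits, and δ_ij is the Kronecker delta (1 when i = j, otherwise 0).

```latex
\[
\operatorname{softmax}(z)_i = \frac{e^{z_i}}{\sum_{k=1}^{K} e^{z_k}},
\qquad
\frac{\partial \operatorname{softmax}(z)_i}{\partial z_j}
  = \operatorname{softmax}(z)_i \bigl(\delta_{ij} - \operatorname{softmax}(z)_j\bigr)
\]
```

Every partial derivative exists and is a smooth function of the logits, which is what backpropagation needs.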

Edit:

There are some other properties of the softmax function that make it more suitable for neural networks than max. First, it is a soft version of the max function. Let's say the network produces 4 logits, [0.5, 0.5, 0.69, 0.7]. Hard max returns 1 at the index of the maximum (in this case the 4th) and 0 at every other index. This results in information loss: the output no longer shows that the 3rd logit, 0.69, was almost as large as the winner. Second, the outputs of the softmax function lie in the interval [0, 1] and sum to 1. For this reason, the output of the softmax function can be interpreted as a probability distribution, so each value can be read as the model's confidence that the input belongs to the corresponding class.
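The comparison above can be sketched in a few lines of plain Python using the answer's example logits. The helper names `softmax` and `hard_max` are hypothetical, chosen for this illustration; they are not taken from any particular framework.

```python
import math

def softmax(logits):
    """Softmax over a list of logits (max subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def hard_max(logits):
    """One-hot list with 1.0 at the first position of the largest logit."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return [1.0 if i == best else 0.0 for i in range(len(logits))]

logits = [0.5, 0.5, 0.69, 0.7]   # example logits from the answer
probs = softmax(logits)
one_hot = hard_max(logits)

print([round(p, 3) for p in probs])  # roughly [0.226, 0.226, 0.273, 0.276]
print(one_hot)                       # [0.0, 0.0, 0.0, 1.0]
```

Because the exponential is monotonically increasing, the largest softmax output always sits at the same index as the largest logit. Picking the max of the raw output therefore yields the same predicted class at inference time. Softmax mainly matters for training and for reading the outputs as probabilities.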



Source: https://stackoverflow.com/questions/50986957/why-not-use-the-max-value-of-output-tensor-instead-of-softmax-function
