Why use softmax only in the output layer and not in hidden layers?

情歌与酒 2020-12-29 10:00

Most examples of neural networks for classification tasks I've seen use a softmax layer as the output activation function. Normally, the other hidden units use a sigmoid, tanh, or ReLU function as the activation function.

5 Answers
  •  天命终不由人
    2020-12-29 10:16

    The softmax function is one of the most important output functions used in deep learning (see Understanding Softmax in Minutes by Uniqtech). Softmax is applied where there are three or more classes of outcomes. The formula raises e to each score and divides by the sum of e raised to all the scores: softmax(x_i) = e^(x_i) / Σ_j e^(x_j). For example, if the logit scores of four classes are [3.00, 2.00, 1.00, 0.10], the output probabilities can be obtained by applying the softmax function as follows:

    import numpy as np

    def softmax(x):
        # Subtract the max score before exponentiating for numerical stability;
        # this shift cancels out and does not change the result.
        z = np.exp(x - np.max(x))
        return z / z.sum()

    scores = [3.00, 2.00, 1.00, 0.10]
    print(softmax(scores))
    # Output: probabilities (p) = [0.642 0.236 0.087 0.035] (rounded)

    The sum of all probabilities (p) = 0.642 + 0.236 + 0.087 + 0.035 = 1.00. You can substitute any values you like into the scores above: you will get different probabilities, but they will always sum to one. That makes sense, because softmax turns arbitrary logit scores into a proper probability distribution, which is what we need for prediction. Finally, the softmax output can help us understand and interpret the multinomial logit model. If you like these thoughts, please leave your comments below.
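    As a quick sketch of that sum-to-one property (reusing the softmax function defined above, with made-up example scores), you can verify it for arbitrary inputs:

    import numpy as np

    def softmax(x):
        z = np.exp(x - np.max(x))
        return z / z.sum()

    # Hypothetical logit scores -- any values work, including negatives and ties.
    for scores in ([5.0, 1.0, -2.0], [0.0, 0.0, 0.0, 0.0]):
        p = softmax(scores)
        print(p, p.sum())  # p.sum() is 1.0 up to floating-point rounding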
