Implementation of a softmax activation function for neural networks
Question

I am using a softmax activation function in the last layer of a neural network, but I have problems with a safe implementation of this function. A naive implementation would be this one:

    Vector y = mlp(x); // output of the neural network without softmax activation function
    for(int f = 0; f < y.rows(); f++)
        y(f) = exp(y(f));
    y /= y.sum();

This does not work very well for more than about 100 hidden nodes, because y will contain NaNs in many cases: if y(f) > 709, exp(y(f)) overflows to inf in double precision, the sum becomes inf as well, and the division then produces inf/inf = NaN. I came up with this:
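For context, the standard remedy rests on the identity softmax(y) = softmax(y - max(y)): shifting all logits by their maximum leaves the result unchanged but keeps every argument of exp() at or below zero, so nothing can overflow. Below is a minimal sketch of that max trick in the same Eigen-style C++ as the snippet above; stableSoftmax is a hypothetical name for illustration, not part of any library:

    #include <cmath>
    #include <Eigen/Dense>

    // Numerically stable softmax via the max trick (illustrative sketch).
    // After subtracting the largest logit, every exp() argument lies in
    // (-inf, 0], so exp() may underflow to 0 but can never overflow to inf.
    Eigen::VectorXd stableSoftmax(Eigen::VectorXd y)
    {
        const double m = y.maxCoeff();   // largest logit
        for (int f = 0; f < y.rows(); f++)
            y(f) = std::exp(y(f) - m);   // arguments are now <= 0
        return y / y.sum();              // sum >= exp(0) = 1, so no inf/inf
    }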