I am using a Softmax activation function in the last layer of a neural network, but I have problems with a numerically safe implementation of this function. A naive implementation of the formula can overflow for large inputs.
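To make the problem concrete, here is a sketch of what such a naive implementation might look like (the function name and the example inputs are mine, for illustration):

```python
import numpy as np

def naive_softmax(z):
    # Direct translation of oj = exp(zj) / sum_i exp(zi).
    # exp() overflows to inf for large z, so inf/inf gives nan.
    e = np.exp(z)
    return e / e.sum()

# Fine for small inputs:
naive_softmax(np.array([0.0, 0.0]))        # well-behaved: [0.5, 0.5]

# Breaks for large inputs: exp(1000) overflows, result is all nan.
naive_softmax(np.array([1000.0, 1000.0]))
```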
I know it's already answered but I'll post here a step-by-step anyway.
Put everything in log scale:
zj = wj . x + bj
oj = exp(zj)/sum_i{ exp(zi) }
log oj = zj - log sum_i{ exp(zi) }
Let m = max_i { zi } and use the log-sum-exp trick:
log oj = zj - log {sum_i { exp(zi + m - m)}}
= zj - log {sum_i { exp(m) exp(zi - m) }},
= zj - log {exp(m) sum_i {exp(zi - m)}}
= zj - m - log {sum_i { exp(zi - m)}}
The term exp(zi - m) can underflow if m is much greater than some zi, but that's fine: it means that zi is irrelevant to the softmax output after normalization. The final result is:
oj = exp (zj - m - log{sum_i{exp(zi-m)}})
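The derivation above can be sketched in NumPy like so (function and variable names are mine; the formula is exactly the last equation):

```python
import numpy as np

def stable_softmax(z):
    # oj = exp(zj - m - log(sum_i exp(zi - m))), with m = max_i zi.
    # Shifting by m guarantees every exp() argument is <= 0,
    # so nothing overflows; at worst some terms underflow to 0.
    m = z.max()
    log_sum = np.log(np.exp(z - m).sum())
    return np.exp(z - m - log_sum)

# Works even where the naive version would overflow:
stable_softmax(np.array([1000.0, 1000.0]))  # [0.5, 0.5]
```

Note that the subtraction of m cancels out mathematically, so for small inputs this returns the same values as the naive formula.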
First go to log scale, i.e. calculate log(y) instead of y. The log of the numerator is trivial. To calculate the log of the denominator, you can use the following 'trick': http://lingpipe-blog.com/2009/06/25/log-sum-of-exponentials/
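The trick from the linked post boils down to a small helper like this (a sketch; the function name is mine):

```python
import math

def log_sum_exp(zs):
    # log(sum_i exp(zi)) = m + log(sum_i exp(zi - m)), m = max_i zi.
    # Shifting by the max keeps every exp() argument <= 0, avoiding overflow.
    m = max(zs)
    return m + math.log(sum(math.exp(z - m) for z in zs))

log_sum_exp([1000.0, 1000.0])  # finite, equals 1000 + log(2)
```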