CS231n: How to calculate gradient for Softmax loss function?

Submitted by 人盡茶涼 on 2019-12-02 18:09:15
Ben Barsdell

Not sure if this helps, but:

The term that appears in the gradient is really the indicator function 1{j = y_i}, which evaluates to 1 when j equals y_i and to 0 otherwise, as described here. This forms the expression (j == y[i]) in the code.

Also, the gradient of the loss with respect to the weights is:

∇_{w_j} L_i = (p_j − 1{j = y_i}) · x_i

where p_j = e^{f_j} / Σ_k e^{f_k} are the softmax probabilities, and

x_i = X[:, i]

is the i-th training sample (stored as a column of X), which is the origin of the X[:,i] in the code.
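Putting the indicator function and the per-sample gradient together, the loop-based gradient computation might look like the sketch below. The shapes are assumptions chosen to match the X[:,i] convention above (samples as columns): W is (C, D), X is (D, N), y is (N,); the function name is hypothetical, not the assignment's exact API.

```python
import numpy as np

def softmax_loss_naive(W, X, y):
    """Loop-based softmax loss and gradient (sketch).

    Assumed shapes: W (C, D), X (D, N) with sample i as column X[:, i],
    y (N,) integer labels in [0, C).
    """
    C, D = W.shape
    N = X.shape[1]
    loss = 0.0
    dW = np.zeros_like(W)
    for i in range(N):
        scores = W.dot(X[:, i])        # (C,) class scores for sample i
        scores -= scores.max()         # shift for numeric stability
        p = np.exp(scores) / np.sum(np.exp(scores))  # softmax probabilities
        loss -= np.log(p[y[i]])        # cross-entropy for the true class
        for j in range(C):
            # dL_i/dw_j = (p_j - 1{j == y_i}) * x_i  -- the (j == y[i])
            # expression is exactly the indicator function from the text
            dW[j] += (p[j] - (j == y[i])) * X[:, i]
    return loss / N, dW / N
```

A quick numeric gradient check (finite differences) is the usual way to validate this kind of code before vectorizing it.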

Jawher.B

I know this is late but here's my answer:

I'm assuming you are familiar with the cs231n Softmax loss function. We know that:

L_i = −log( e^{f_{y_i}} / Σ_j e^{f_j} )

So, just as we did with the SVM loss function, the gradients are as follows:

∇_{w_{y_i}} L_i = (p_{y_i} − 1) · x_i    for the correct class j = y_i
∇_{w_j} L_i = p_j · x_i                  for every other class j ≠ y_i

where p_j = e^{f_j} / Σ_k e^{f_k}.
Hope that helped.
