Cost function always returning zero for a binary classification in tensorflow

Submitted by 核能气质少年 on 2019-12-01 20:49:59

The only problem is the invalid cost being used. softmax_cross_entropy_with_logits should be used only when you have more than one output unit (e.g. more than two classes), because the softmax of a single output always returns 1; softmax is defined as:

softmax(x)_i = exp(x_i) / SUM_j exp(x_j)

so for a single number (a one-dimensional output):

softmax(x) = exp(x) / exp(x) = 1

Furthermore, for a softmax output TF expects one-hot encoded labels, so if you provide only a 0 or a 1, there are two possibilities:

  1. True label is 0, so the cost is -0*log(1) = 0
  2. True label is 1, so the cost is -1*log(1) = 0
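A minimal sketch reproducing the symptom (assuming the TF 1.x API, consistent with the tf.log calls used later in this answer; the tensor values are made up):

import tensorflow as tf

logits = tf.constant([[2.0], [-1.0], [0.5]])   # one output unit per example
labels = tf.constant([[1.0], [0.0], [1.0]])    # "one-hot" labels of length one

probs = tf.nn.softmax(logits)  # exp(x) / exp(x) -> always 1.0
cost = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(probs))  # [[1.], [1.], [1.]]
    print(sess.run(cost))   # [0., 0., 0.] whatever the logits and labels are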

TensorFlow has a separate function to handle binary classification, which applies a sigmoid instead (note that the same function applied to more than one output would apply the sigmoid independently to each dimension, which is what multi-label classification expects):

tf.nn.sigmoid_cross_entropy_with_logits

Just switch to this cost and you are good to go; you also no longer have to one-hot encode anything, as this function is designed specifically for your use case.
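For instance (again assuming TF 1.x, with made-up values):

import tensorflow as tf

logits = tf.constant([[2.0], [-1.0], [0.5]])  # a single output unit, raw scores
labels = tf.constant([[1.0], [0.0], [1.0]])   # plain 0/1 targets, no one-hot

# The sigmoid squashes each logit into (0, 1) independently, so one output suffices.
cost = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(cost))  # non-zero, and sensitive to how wrong each logit is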

The only missing bit is that your code does not have an actual training routine: you need to define an optimiser, ask it to minimise the loss, and then run the train op in a loop, as sketched below. In your current setup you just try to predict over and over with a network that never changes.
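A self-contained skeleton of such a routine (assuming TF 1.x; the tiny graph, the variable names and the toy data below are hypothetical stand-ins for your own code):

import numpy as np
import tensorflow as tf

# A toy logistic-regression graph; x_ph, y_ph, W and b are made-up names.
x_ph = tf.placeholder(tf.float32, [None, 2])
y_ph = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1]))
logits = tf.matmul(x_ph, W) + b

loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_ph, logits=logits))

# The missing pieces: an optimiser, a minimise op, and a loop that runs it.
train_op = tf.train.GradientDescentOptimizer(learning_rate=0.5).minimize(loss)

data_x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
data_y = np.array([[0], [0], [0], [1]], dtype=np.float32)  # a toy AND problem

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(500):
        _, l = sess.run([train_op, loss], feed_dict={x_ph: data_x, y_ph: data_y})
    print(l)  # keeps decreasing, unlike a graph that only ever runs predictions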

In particular, please refer to the Cross Entropy Jungle question on SO, which provides a more detailed description of all these different helper functions in TF (and other libraries) and of their different requirements and use cases.

softmax_cross_entropy_with_logits is basically a numerically stable implementation of these two parts:

softmax = tf.nn.softmax(prediction)                 # normalise the logits
cost = -tf.reduce_sum(labels * tf.log(softmax), 1)  # per-example cross entropy
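As a quick sanity check (assuming TF 1.x; values made up), the manual version agrees with the built-in op on a proper multi-class example, with the built-in op additionally guarding against overflow for large logits:

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])  # three classes this time
labels = tf.constant([[1.0, 0.0, 0.0]])  # one-hot target

manual = -tf.reduce_sum(labels * tf.log(tf.nn.softmax(logits)), 1)
builtin = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run([manual, builtin]))  # both roughly [0.417]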

Now, in your example, prediction is a single value, so when you apply softmax to it, it is always going to be 1 irrespective of the value (exp(prediction) / exp(prediction) = 1), and so the tf.log(softmax) term becomes 0. That is why you always get a cost of zero.

Either apply a sigmoid to get your probabilities between 0 and 1, or, if you want to use softmax, encode the labels as [1, 0] for class 0 and [0, 1] for class 1 (with two output units), as sketched below.
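A sketch of that softmax variant (assuming TF 1.x; values made up):

import tensorflow as tf

# Two output units plus one-hot labels make the softmax meaningful again.
logits = tf.constant([[2.0, -2.0], [-1.0, 1.0]])  # two units per example
labels = tf.constant([[1.0, 0.0], [0.0, 1.0]])    # class 0, then class 1

cost = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(cost))  # roughly [0.018, 0.127]: non-zero, logit-dependent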
