Using squared difference of two images as loss function in tensorflow

Submitted by 那年仲夏 on 2019-12-23 15:43:33

Question


I'm trying to use the SSD (sum of squared differences) between two images as the loss function for my network.

# h_fc2 is my output layer, y_ is my label image.
ssd = tf.reduce_sum(tf.square(y_ - h_fc2))

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(ssd)

The problem is that the weights then diverge and I get the error

 ReluGrad input is not finite. : Tensor had Inf values

Why is that? I tried a few other things, like normalizing the ssd by the image size (did not work) or clamping the output values at 1 (no longer crashes, but I still need to evaluate it):

# Clamp the network output at 1 before taking the squared difference.
ssd_min_1 = tf.reduce_sum(tf.square(y_ - tf.minimum(h_fc2, 1)))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(ssd_min_1)

Are my observations to be expected?

Edit: @mdaoust's suggestions proved to be correct. The main point was normalizing the loss by the batch size, which the following code does independently of the chosen batch size:

squared_diff_image = tf.square(label_image - output_img)
# Sum over all dimensions except the first (the batch-dimension).
ssd_images = tf.reduce_sum(squared_diff_image, [1, 2, 3])
# Take mean ssd over batch.
error_images = tf.reduce_mean(ssd_images)

With this change, only a slight decrease in the learning rate (to 0.0001) was necessary.
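
Wiring the normalized loss into the training step then looks like this (a minimal sketch, reusing error_images from above together with the reduced learning rate):

# Minimize the batch-mean SSD with the smaller learning rate.
train_step = tf.train.GradientDescentOptimizer(0.0001).minimize(error_images)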


Answer 1:


There are a lot of ways you can end up with non-finite results.

But optimizers, especially simple ones like gradient descent, can diverge if the learning rate is 'too high'.

Have you tried simply dividing your learning rate by 10/100/1000? Or normalizing by pixels*batch_size to get the average error per pixel?
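For instance, a minimal sketch of the per-pixel normalization (assuming y_ and h_fc2 have shape [batch, height, width, channels], as in the question):

# Taking the mean over all elements (batch * pixels * channels) gives the
# average squared error per pixel, so the loss no longer grows with the
# image size or the batch size.
mse = tf.reduce_mean(tf.square(y_ - h_fc2))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(mse)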

Or one of the more advanced optimizers? For example tf.train.AdamOptimizer() with default options.
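
A minimal sketch with the same loss (tf.train.AdamOptimizer defaults to a learning rate of 0.001):

# Adam adapts the step size per parameter, which is far more forgiving
# of a poorly scaled loss than plain gradient descent.
train_step = tf.train.AdamOptimizer().minimize(ssd)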



Source: https://stackoverflow.com/questions/33753251/using-squared-difference-of-two-images-as-loss-function-in-tensorflow
