TensorFlow deep neural network for regression always predicts the same results in one batch

庸人自扰 2020-12-04 10:35

I use TensorFlow to implement a simple multi-layer perceptron for regression. The code is modified from the standard MNIST classifier; I only changed the output cost to M

2 answers
  •  执念已碎
    2020-12-04 10:48

    Short answer:

    Transpose your pred vector using tf.transpose(pred).

    Longer answer:

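    The problem is that pred (the predictions) and y (the labels) do not have the same shape: one is a row vector and the other a column vector. When you apply an element-wise operation such as subtraction to them, broadcasting expands both operands, so you get a batch_size x batch_size matrix instead of a vector of per-example errors. The cost is then averaged over every prediction/label pairing in the batch, which is not what you want.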
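The shape mismatch described above can be reproduced in a few lines. This is only a sketch: it uses NumPy instead of TensorFlow (both follow the same broadcasting rules), it assumes pred is a column of shape (batch, 1) and y a flat array of shape (batch,), and the numbers are illustrative values loosely based on the log output in this answer, not the asker's data.

```python
import numpy as np

# Illustrative batch of 3 examples (assumed values, not the asker's data).
y = np.array([23.0, 50.0, 22.0])             # labels, shape (3,), broadcasts like a row
pred = np.array([[21.6], [48.8], [24.3]])    # predictions, shape (3, 1), a column

# Element-wise subtraction broadcasts both operands to (3, 3):
# every prediction is compared against every label in the batch.
diff = pred - y
print(diff.shape)

# Mean squared error over the full 3x3 matrix (what the buggy cost computes).
mismatched_cost = np.mean(diff ** 2)

# Mean squared error over matching pairs only (what was intended).
intended_cost = np.mean((pred[:, 0] - y) ** 2)

print(mismatched_cost, intended_cost)
```

Because the mismatched cost averages over all pairings, it is plausibly minimized by pushing every prediction toward the batch mean of the labels, which would match the symptom in the question title.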

    The solution is to transpose the prediction vector using tf.transpose() so that it has the same orientation as the labels; the element-wise difference is then a vector and the loss function is computed correctly. In fact, if you set the batch size to 1 in your example you'll see that it works even without the fix, because transposing a 1x1 vector is a no-op.
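The effect of the transpose, and the batch-size-1 special case, can be checked with a small sketch. As before this uses NumPy as a stand-in for TensorFlow (pred.T plays the role of tf.transpose(pred)); the helper name mse, the shapes, and the values are assumptions made for illustration.

```python
import numpy as np

def mse(pred, y):
    """Mean squared error over whatever shape broadcasting produces."""
    return float(np.mean((pred - y) ** 2))

# Illustrative batch (assumed values): column of predictions, flat labels.
y = np.array([23.0, 50.0, 22.0])             # shape (3,)
pred = np.array([[21.6], [48.8], [24.3]])    # shape (3, 1)

# Without the fix: (3, 1) - (3,) broadcasts to (3, 3).
cost_broken = mse(pred, y)

# With the fix: transposing gives (1, 3), so the difference stays (1, 3)
# and each prediction is compared only with its own label.
pred_fixed = pred.T
cost_fixed = mse(pred_fixed, y)
print((pred_fixed - y).shape, cost_broken, cost_fixed)

# Batch size 1: transposing a 1x1 array changes nothing, so both costs agree.
y1 = np.array([23.0])
pred1 = np.array([[21.6]])
batch_one_same = mse(pred1, y1) == mse(pred1.T, y1)
print(batch_one_same)
```

An alternative with the same effect would be to reshape the labels to a column instead of transposing the predictions; which side to change is a design choice, as long as both tensors end up with identical shapes before the subtraction.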

    I applied this fix to your example code and observed the following behaviour. Before the fix:

    Epoch: 0245 cost= 84.743440580
    [*]----------------------------
    label value: 23 estimated value: [ 27.47437096]
    label value: 50 estimated value: [ 24.71126747]
    label value: 22 estimated value: [ 23.87785912]
    

    And after the fix at the same point in time:

    Epoch: 0245 cost= 4.181439120
    [*]----------------------------
    label value: 23 estimated value: [ 21.64333534]
    label value: 50 estimated value: [ 48.76105118]
    label value: 22 estimated value: [ 24.27996063]
    

    You'll see that the cost is much lower and that it actually learned the value 50 properly. You'll have to do some fine-tuning on the learning rate and such to improve your results of course.
