tensorflow deep neural network for regression always predict same results in one batch

前端未结

关注

 2  1885

庸人自扰 2020-12-04 10:35

I use a tensorflow to implement a simple multi-layer perceptron for regression. The code is modified from standard mnist classifier, that I only changed the output cost to M

2条回答

执念已碎 (楼主)

2020-12-04 10:48
Short answer:

Transpose your pred vector using tf.transpose(pred).

Longer answer:

The problem is that pred (the predictions) and y (the labels) are not of the same shape: one is a row vector and the other a column vector. Apparently when you apply an element-wise operation on them, you'll get a matrix, which is not what you want.

The solution is to transpose the prediction vector using tf.transpose() to get a proper vector and thus a proper loss function. Actually, if you set the batch size to 1 in your example you'll see that it works even without the fix, because transposing a 1x1 vector is a no-op.

I applied this fix to your example code and observed the following behaviour. Before the fix:
```
Epoch: 0245 cost= 84.743440580
[*]----------------------------
label value: 23 estimated value: [ 27.47437096]
label value: 50 estimated value: [ 24.71126747]
label value: 22 estimated value: [ 23.87785912]
```
And after the fix at the same point in time:
```
Epoch: 0245 cost= 4.181439120
[*]----------------------------
label value: 23 estimated value: [ 21.64333534]
label value: 50 estimated value: [ 48.76105118]
label value: 22 estimated value: [ 24.27996063]
```
You'll see that the cost is much lower and that it actually learned the value 50 properly. You'll have to do some fine-tuning on the learning rate and such to improve your results of course.
0 讨论(0)

查看其它2个回答
发布评论:

提交评论
- 加载中...