Why do I get [nan] when using TensorFlow to calculate a simple linear regression?

Submitted by 只愿长相守 on 2019-12-12 10:26:52

Question


When I use TensorFlow to calculate a simple linear regression, I get [nan] for all of w, b, and the loss.

Here is my code:

import tensorflow as tf

w = tf.Variable(tf.zeros([1]), tf.float32)
b = tf.Variable(tf.zeros([1]), tf.float32)
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

liner = w*x+b

loss = tf.reduce_sum(tf.square(liner-y))

train = tf.train.GradientDescentOptimizer(1).minimize(loss)

sess = tf.Session()

x_data = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]
y_data = [265000, 324000, 340000, 412000, 436000, 490000, 574000, 585000, 680000]                                                    

sess.run(tf.global_variables_initializer())

for i in range(1000):
    sess.run(train, {x: x_data, y: y_data})

nw, nb, nloss = sess.run([w, b, loss], {x: x_data, y: y_data})

print(nw, nb, nloss)

Output:

[ nan] [ nan] nan

Process finished with exit code 0

Why does this happen, and how can I fix it?


Answer 1:


You are overflowing because the learning rate is so high (1 in your case). Try a learning rate of 0.001. You also need to divide your data by 1000 and increase the number of iterations, and then it should work. This is the code I tested, and it works:

import tensorflow as tf
import matplotlib.pyplot as plt

# Data divided by 1000 so the values stay in a numerically friendly range
x_data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_data = [265, 324, 340, 412, 436, 490, 574, 585, 680]

plt.plot(x_data, y_data, 'ro', label='Original data')
plt.legend()
plt.show()

W = tf.Variable(tf.random_uniform([1], 0, 1))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Mean squared error (reduce_mean rather than reduce_sum keeps gradients small)
loss = tf.reduce_mean(tf.square(y - y_data))

optimizer = tf.train.GradientDescentOptimizer(0.001)  # a much smaller learning rate
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()  # initialize_all_variables() is deprecated

sess = tf.Session()
sess.run(init)

for step in range(50000):
    sess.run(train)
    print(step, sess.run(loss))
print(step, sess.run(W), sess.run(b))

plt.plot(x_data, y_data, 'ro', label='Original data')
plt.plot(x_data, sess.run(W) * x_data + sess.run(b), label='Fitted line')
plt.legend()
plt.show()
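
Since both x and y were divided by 1000, predictions on original-scale inputs need the scaling undone. A minimal sketch of how to do that with the session above (the predict helper is my own illustration, not part of the tested code):

# Hypothetical helper: scale the input down by 1000 before applying the
# trained model, then scale the output back up by 1000.
w_val, b_val = sess.run(W), sess.run(b)

def predict(x_raw):
    return (w_val[0] * (x_raw / 1000.0) + b_val[0]) * 1000.0

print(predict(4500))  # estimated y for an original-scale x of 4500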



Answer 2:


I believe this gives the explanation:

for i in range(10):
    print(sess.run([train, w, b, loss], {x: x_data, y: y_data}))

Gives the following result:

[None, array([  4.70380012e+10], dtype=float32), array([ 8212000.], dtype=float32), 2.0248419e+12]
[None, array([ -2.68116614e+19], dtype=float32), array([ -4.23342041e+15], dtype=float32), 6.3058345e+29]
[None, array([  1.52826476e+28], dtype=float32), array([  2.41304958e+24], dtype=float32), inf]
[None, array([ -8.71110858e+36], dtype=float32), array([ -1.37543819e+33], dtype=float32), inf]
[None, array([ inf], dtype=float32), array([ inf], dtype=float32), inf]

Your learning rate is simply too big, so you "overcorrect" the value of w at each iteration (see how it oscillates between negative and positive with ever-increasing absolute value). The values keep growing until something reaches infinity, and once infinities enter the arithmetic, operations like inf - inf produce NaN. Just lower the learning rate, by a lot.
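
You can see the same mechanism outside TensorFlow: for the sum-of-squares loss, each gradient step multiplies the error in w by roughly (1 - 2 * lr * sum(x**2)), and with x in the thousands and lr = 1 that factor is enormous. A minimal NumPy sketch (my own illustration using a subset of the data, not code from the question or answers; expect overflow warnings on the diverging run):

import numpy as np

x = np.array([1000., 2000., 3000.], dtype=np.float32)
y = np.array([265000., 324000., 340000.], dtype=np.float32)

for lr in (1.0, 1e-8):  # the question's rate vs. a stable one
    w = np.float32(0.0)
    for step in range(6):
        grad = 2 * np.sum(x * (w * x - y))  # d/dw of sum((w*x - y)**2)
        w = w - np.float32(lr) * grad
    print("lr =", lr, "-> w after 6 steps:", w)

With lr = 1.0 the weight blows past float32's range to inf within a few steps and then turns into nan (inf - inf), just like the traced output above; with lr = 1e-8 it converges toward the least-squares solution.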



Source: https://stackoverflow.com/questions/47928554/why-do-i-get-nan-when-using-tensorflow-to-calculate-a-simple-linear-regression
