Why do I get [nan] when using TensorFlow to calculate a simple linear regression?

Submitted by 只愿长相守 on 2019-12-12 10:26:52

Question


When I use TensorFlow to calculate a simple linear regression, I get [nan] for all of w, b, and the loss.

Here is my code:

import tensorflow as tf

w = tf.Variable(tf.zeros([1]), tf.float32)
b = tf.Variable(tf.zeros([1]), tf.float32)
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

liner = w*x+b

loss = tf.reduce_sum(tf.square(liner-y))

train = tf.train.GradientDescentOptimizer(1).minimize(loss)

sess = tf.Session()

x_data = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]
y_data = [265000, 324000, 340000, 412000, 436000, 490000, 574000, 585000, 680000]                                                    

sess.run(tf.global_variables_initializer())

for i in range(1000):
    sess.run(train, {x: x_data, y: y_data})

nw, nb, nloss = sess.run([w, b, loss], {x: x_data, y: y_data})

print(nw, nb, nloss)

Output:

[ nan] [ nan] nan

Process finished with exit code 0

Why does this happen, and how can I fix it?


Answer 1:


You are overflowing because the learning rate is so high (1 in your case). Try a learning rate of 0.001. You also need to divide your data by 1000 and increase the number of iterations, and then it should work. This is the code I tested, and it works:

import tensorflow as tf
import matplotlib.pyplot as plt

# Data divided by 1000 so the values stay in a numerically friendly range
x_data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_data = [265, 324, 340, 412, 436, 490, 574, 585, 680]

plt.plot(x_data, y_data, 'ro', label='Original data')
plt.legend()
plt.show()

W = tf.Variable(tf.random_uniform([1], 0, 1))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Mean squared error (reduce_mean rather than reduce_sum keeps gradients small)
loss = tf.reduce_mean(tf.square(y - y_data))

optimizer = tf.train.GradientDescentOptimizer(0.001)  # a much smaller learning rate
train = optimizer.minimize(loss)
init = tf.global_variables_initializer()  # initialize_all_variables() is deprecated

sess = tf.Session()
sess.run(init)

for step in range(50000):
    sess.run(train)
    print(step, sess.run(loss))
print(step, sess.run(W), sess.run(b))

plt.plot(x_data, y_data, 'ro', label='Original data')
plt.plot(x_data, sess.run(W) * x_data + sess.run(b), label='Fitted line')
plt.legend()
plt.show()
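
Since both x and y were divided by 1000, predictions on original-scale inputs need the scaling undone. A minimal sketch of how to do that with the session above (the predict helper is my own illustration, not part of the tested code):

# Hypothetical helper: scale the input down by 1000 before applying the
# trained model, then scale the output back up by 1000.
w_val, b_val = sess.run(W), sess.run(b)

def predict(x_raw):
    return (w_val[0] * (x_raw / 1000.0) + b_val[0]) * 1000.0

print(predict(4500))  # estimated y for an original-scale x of 4500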



Answer 2:


I believe this gives the explanation:

for i in range(10):
    print(sess.run([train, w, b, loss], {x: x_data, y: y_data}))

Gives the following result:

[None, array([  4.70380012e+10], dtype=float32), array([ 8212000.], dtype=float32), 2.0248419e+12]
[None, array([ -2.68116614e+19], dtype=float32), array([ -4.23342041e+15], dtype=float32), 6.3058345e+29]
[None, array([  1.52826476e+28], dtype=float32), array([  2.41304958e+24], dtype=float32), inf]
[None, array([ -8.71110858e+36], dtype=float32), array([ -1.37543819e+33], dtype=float32), inf]
[None, array([ inf], dtype=float32), array([ inf], dtype=float32), inf]

Your learning rate is simply too big, so you "overcorrect" the value of w at each iteration (see how it oscillates between negative and positive with ever-increasing absolute value). The values keep growing until something reaches infinity, and once infinities enter the arithmetic, operations like inf - inf produce NaN. Just lower the learning rate, by a lot.
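
You can see the same mechanism outside TensorFlow: for the sum-of-squares loss, each gradient step multiplies the error in w by roughly (1 - 2 * lr * sum(x**2)), and with x in the thousands and lr = 1 that factor is enormous. A minimal NumPy sketch (my own illustration using a subset of the data, not code from the question or answers; expect overflow warnings on the diverging run):

import numpy as np

x = np.array([1000., 2000., 3000.], dtype=np.float32)
y = np.array([265000., 324000., 340000.], dtype=np.float32)

for lr in (1.0, 1e-8):  # the question's rate vs. a stable one
    w = np.float32(0.0)
    for step in range(6):
        grad = 2 * np.sum(x * (w * x - y))  # d/dw of sum((w*x - y)**2)
        w = w - np.float32(lr) * grad
    print("lr =", lr, "-> w after 6 steps:", w)

With lr = 1.0 the weight blows past float32's range to inf within a few steps and then turns into nan (inf - inf), just like the traced output above; with lr = 1e-8 it converges toward the least-squares solution.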



Source: https://stackoverflow.com/questions/47928554/why-do-i-get-nan-when-using-tensorflow-to-calculate-a-simple-linear-regression
