Question
I am implementing a program that performs linear regression on the following dataset:
http://www.rossmanchance.com/iscam2/data/housing.txt
My program is as follows:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def abline(X,theta,Y):
    yValues=calcH(X,theta)
    plt.xlim(0, 5000)
    plt.ylim(0, 2000000)
    plt.xlabel("sqft")
    plt.ylabel("price")
    plt.gca().set_aspect(0.001, adjustable='box')
    plt.plot(X,Y,'.',X, yValues, '-')
    plt.show()

def openFile(fileR):
    f=pd.read_csv(fileR,sep="\t")
    header=f.columns.values
    prediction=f["price"]
    X=f["sqft"]
    gradientDescent(0.0005,100,prediction,X)

def calcH(X,theta):
    h=np.dot(X,theta)
    return h

def calcC(X,Y,theta):
    d=((calcH(X,theta)-Y)**2).mean()/2
    return d

def gradientDescent(learningRate,itera, Y, X):
    t0=[]
    t1=[]
    cost=[]
    theta=np.zeros(2)
    X=np.column_stack((np.ones(len(X)),X))
    for i in range(itera):
        h_theta=calcH(X,theta)
        theta0=theta[0]-learningRate*(Y-h_theta).mean()
        theta1=theta[1]-learningRate*((Y-h_theta)*X[:,1]).mean()
        theta=np.array([theta0,theta1])
        j=calcC(X,Y,theta)
        t0.append(theta0)
        t1.append(theta1)
        cost.append(j)
        if (i%10==0):
            print ("iteration ",i,"cost ",j,"theta ",theta)
    abline(X,theta,Y)
The problem is that the values of theta end up at Inf. I have tested with only 3 iterations, and some of the values are as follows:
iteration 0 cost 9.948977633931098e+21 theta [-2.47365759e+04 -6.10382173e+07]
iteration 1 cost 7.094545903263138e+32 theta [-6.46495395e+09 -1.62995849e+13]
iteration 2 cost 5.059070733255204e+43 theta [-1.72638812e+15 -4.35260862e+18]
I would like to predict the price based on the sqft variable. I am basically following the formulas given by Andrew Ng in his Coursera ML course. By deriving the cost term, I got the update rules:
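In the standard course notation, with m training examples, learning rate alpha, and hypothesis h_theta(x) = theta_0 + theta_1 * x, those update rules are:

theta_0 := theta_0 - alpha * (1/m) * Σ_{i=1..m} (h_theta(x^(i)) - y^(i))
theta_1 := theta_1 - alpha * (1/m) * Σ_{i=1..m} (h_theta(x^(i)) - y^(i)) * x^(i)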
Update: I have added a function to plot my data and, strangely, the plots I get are not correct: the fitted line's predictions keep going up, even though when I plot the data itself the relationship is clearly linear.
What am I doing wrong?
Thanks
Answer 1:
I replicated your results. Besides some stylistic issues and the swapping of (Y-h_theta) for (h_theta-Y) (as pointed out in one of the comments), the actual code is correct. It's just that the numbers are massive, which makes each update overshoot the minimum and oscillate between extremes, every iteration trying to "counteract" the last step with an even bigger step in the other direction. A very low learning rate could work. In real-world applications, you could also normalize your data to address some of these issues.
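Here is a minimal sketch of both fixes: the gradient sign corrected to (h - y), and the sqft feature standardized before the descent. It assumes you have saved the linked dataset locally as housing.txt with the same tab-separated sqft and price columns; the structure and the name gradient_descent are mine, not a drop-in replacement for your functions.

import numpy as np
import pandas as pd

def gradient_descent(X, y, learning_rate=0.1, iterations=100):
    # X already carries a leading column of ones for the intercept term.
    theta = np.zeros(X.shape[1])
    for i in range(iterations):
        h = X.dot(theta)                    # h_theta(x) for every example
        gradient = X.T.dot(h - y) / len(y)  # note (h - y), not (y - h)
        theta = theta - learning_rate * gradient
        if i % 10 == 0:
            cost = ((h - y) ** 2).mean() / 2
            print("iteration", i, "cost", cost, "theta", theta)
    return theta

f = pd.read_csv("housing.txt", sep="\t")  # local copy of the linked dataset
x = f["sqft"].to_numpy(dtype=float)
y = f["price"].to_numpy(dtype=float)

# Standardizing the feature keeps each gradient step at a sane scale,
# so the updates stop overshooting and oscillating.
x_std = (x - x.mean()) / x.std()
X = np.column_stack((np.ones(len(x_std)), x_std))

theta = gradient_descent(X, y)

Keep in mind that theta is then expressed in standardized units: to predict the price for a raw sqft value, transform it with the same mean and standard deviation first.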
Source: https://stackoverflow.com/questions/57740947/implementation-of-linear-regression-values-of-weights-increases-to-inf