Question:
Just trying to do a simple linear regression but I'm baffled by this error for:
regr = LinearRegression()
regr.fit(df2.iloc[1:1000, 5].values, df2.iloc[1:1000, 2].values)
which produces:
ValueError: Found arrays with inconsistent numbers of samples: [ 1 999]
These selections must have the same dimensions, and they should be numpy arrays, so what am I missing?
Answer 1:
It looks like sklearn requires the data to have shape (number of rows, number of columns). If your data has shape (number of rows,), like (999,), it does not work. Using numpy.reshape(), you should change it to (999, 1), e.g. using
data.reshape((999, 1))
That worked in my case.
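To make the shape change concrete, here is a minimal sketch using a made-up array in place of the asker's column (the name data and its contents are assumptions, not from the question). As far as I can tell, the "[ 1 999]" in the error message comes from the 1-D X being read as 1 sample with 999 features while y has 999 samples.

```python
import numpy as np

# Hypothetical stand-in for df2.iloc[1:1000, 5].values: 999 values in a 1-D array.
data = np.arange(999, dtype=float)
print(data.shape)  # (999,)

# -1 lets NumPy infer the row count, so the length need not be hard-coded.
column = data.reshape(-1, 1)
print(column.shape)  # (999, 1)
```

Using reshape(-1, 1) instead of reshape((999, 1)) keeps the code working if the slice length changes.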
Answer 2:
It looks like you are using a pandas DataFrame (judging from the name df2).
You could also do the following:
regr = LinearRegression()
regr.fit(df2.iloc[1:1000, 5].to_frame(), df2.iloc[1:1000, 2].to_frame())
NOTE: I have removed "values" because it converts the pandas Series to a numpy.ndarray, and numpy.ndarray has no to_frame() method.
Answer 3:
I think the "X" argument of regr.fit needs to be a matrix, so the following should work.
regr = LinearRegression()
regr.fit(df2.iloc[1:1000, [5]].values, df2.iloc[1:1000, 2].values)
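The difference between 5 and [5] as the column indexer is easy to miss, so here is a small sketch of it. The frame below is a made-up 10x6 stand-in for df2; only the shapes matter.

```python
import numpy as np
import pandas as pd

# Hypothetical 6-column frame standing in for df2.
df2 = pd.DataFrame(np.arange(60.0).reshape(10, 6))

as_series = df2.iloc[1:10, 5]    # scalar column index -> Series (1-D)
as_frame = df2.iloc[1:10, [5]]   # list column index -> DataFrame (2-D)

print(as_series.values.shape)      # (9,)
print(as_frame.values.shape)       # (9, 1)
print(as_series.to_frame().shape)  # (9, 1)
```

Either the list indexer or to_frame() gives the 2-D shape that X needs.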
Answer 4:
I encountered this error because I converted my data to an np.array. I fixed the problem by converting my data to an np.matrix instead and taking the transpose.
ValueError: regr.fit(np.array(x_list), np.array(y_list))
Correct: regr.fit(np.transpose(np.matrix(x_list)), np.transpose(np.matrix(y_list)))
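One caveat: the NumPy documentation no longer recommends np.matrix, and I believe recent scikit-learn versions may reject it as input. A sketch of getting the same (N, 1) shape with a plain ndarray follows; x_list here holds made-up values.

```python
import numpy as np

x_list = [1.0, 2.0, 3.0, 4.0]  # hypothetical sample values

via_matrix = np.transpose(np.matrix(x_list))   # the approach from the answer
via_reshape = np.array(x_list).reshape(-1, 1)  # plain ndarray alternative

print(via_matrix.shape)   # (4, 1)
print(via_reshape.shape)  # (4, 1)
print(np.array_equal(np.asarray(via_matrix), via_reshape))  # True
```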
Answer 5:
fit expects X to be a feature matrix.
Try putting your feature names in a list like this:
features = ['TV', 'Radio', 'Newspaper']
X = data[features]
Answer 6:
Seen on the Udacity deep learning foundation course:
df = pd.read_csv('my.csv')
...
regr = LinearRegression()
regr.fit(df[['column x']], df[['column y']])
Answer 7:
As mentioned above, the X argument must be a matrix or a numpy array with two dimensions. So you could probably use this:
df2.iloc[1:1000, 5:some_last_index].values
That way your dataframe is converted to a 2-D array and you won't need to reshape it.
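A quick sketch of why the slice works: slicing columns keeps the result 2-D even when it selects a single column. The frame and the slice bounds below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical 8-column frame standing in for df2.
df2 = pd.DataFrame(np.arange(80.0).reshape(10, 8))

one_col = df2.iloc[1:10, 5:6].values     # a slice keeps the column axis
three_cols = df2.iloc[1:10, 5:8].values  # several feature columns at once

print(one_col.shape)     # (9, 1)
print(three_cols.shape)  # (9, 3)
```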
Answer 8:
To analyze two arrays (array1 and array2), they need to meet the following two requirements:
1) Both need to be a numpy.ndarray
Check with
type(array1)  # and type(array2)
If that is not the case for at least one of them, perform
array1 = numpy.asarray(array1)  # or array2 = numpy.asarray(array2)
2) The dimensions need to be as follows:
array1.shape  # should give (N, 1)
array2.shape  # should give (N,)
N is the number of items that are in the array. To provide array1 with the right number of axes perform:
array1 = array1[:, numpy.newaxis]
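Putting both requirements together, here is a minimal end-to-end sketch. The data is synthetic (y = 2x + 1, chosen only so the result is easy to check), and the names x, y and X are assumptions rather than anything from the question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data on an exact line: y = 2x + 1.
x = np.arange(10, dtype=float)  # shape (10,)
y = 2.0 * x + 1.0               # shape (10,)

X = x[:, np.newaxis]            # add a column axis -> shape (10, 1)

regr = LinearRegression()
regr.fit(X, y)

print(X.shape, y.shape)  # (10, 1) (10,)
print(np.allclose(regr.coef_, [2.0]), np.isclose(regr.intercept_, 1.0))  # True True
```

Passing x directly instead of X is what triggers the error from the question.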