How to get the prediction of test from 2D parameters of WLS regression in statsmodels

Submitted by 拜拜、爱过 on 2019-12-04 05:16:06

Question


I'm incrementally updating the parameters of WLS regression functions using statsmodels.

I have a 10x3 dataset X that I declared like this:

X = np.array([[1,2,3],[1,2,3],[4,5,6],[1,2,3],[4,5,6],[1,2,3],[1,2,3],[4,5,6],[4,5,6],[1,2,3]])

This is my dataset, and I have a 10x2 endog vector that looks like this:

z =
[[  3.90311860e-322   2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]
 [  0.00000000e+000  -2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]
 [  0.00000000e+000  -2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]
 [  0.00000000e+000  -2.00000000e+000]
 [  0.00000000e+000  -2.00000000e+000]
 [  0.00000000e+000   2.00000000e+000]]

Now, after importing statsmodels with import statsmodels.api as sm, I do this:

g = np.zeros([3, 2]) # g(x) is a function that will store the regression parameters
mod_wls = sm.WLS(z, X)
temp_g = mod_wls.fit()
print(temp_g.params)

And I get this output:

[[ -5.92878775e-323  -2.77777778e+000]
 [ -4.94065646e-324  -4.44444444e-001]
 [  4.94065646e-323   1.88888889e+000]]

Earlier, from the answer to this question, I was able to predict the value of test data X_test using numpy.dot, like this:

np.dot(X_test, temp_g.params)

I understood that easily, since the endog vector y was a 1D array. But how does it work when my endog vector (in this case, z) is 2D? When I try the above line as it was used in the 1D version, I get the following error:

   self._check_integrity()
  File "C:\Users\app\Anaconda\lib\site-packages\statsmodels\base\data.py", line 247, in _check_integrity
    raise ValueError("endog and exog matrices are different sizes")
ValueError: endog and exog matrices are different sizes

Answer 1:


np.dot(X_test, temp_g.params) should still work.
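As a minimal sketch of why the shapes still line up: the parameter array below copies the values printed in the question, with the denormal first column rounded to zero, and X_test is a hypothetical test set that is not part of the original post.

```python
import numpy as np

# Parameters as printed in the question (first column rounded to zero).
# Shape (3, 2): one column of coefficients per column of z.
params = np.array([[0.0, -2.77777778],
                   [0.0, -0.44444444],
                   [0.0,  1.88888889]])

# Hypothetical test data: rows are observations, columns match those of X.
X_test = np.array([[1, 2, 3],
                   [4, 5, 6]])

# (2, 3) dot (3, 2) gives (2, 2): one predicted column per column of z.
pred = np.dot(X_test, params)
print(pred.shape)
print(pred)
```

Each column of the result is the prediction for the corresponding column of z.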

In some cases you need to check the orientation of the matrices; sometimes it's necessary to transpose.

However, predict and most other methods of the results object will not work, because the model assumes that the dependent variable, z, is 1D.

The question, again, is what you are trying to do.

If you want to independently fit columns of z, then iterate over it so each y is 1D.

for y in z.T:
    res = sm.WLS(y, X).fit()

z.T allows iteration over columns.

In other cases, we usually stack the model so that y is 1D: its first part is z[:,0] and its second part is z[:,1]. The design matrix (the matrix of explanatory variables) has to be expanded correspondingly.

Support for multivariate dependent variables is in the making for statsmodels but will still take some time to be ready.



Source: https://stackoverflow.com/questions/23369859/how-to-get-the-prediction-of-test-from-2d-parameters-of-wls-regression-in-statsm
