Missing intercepts of OLS Regression models in Python statsmodels


I am running a rolling OLS regression (for example, with a window of 100 observations) on the dataset found in this link (https://drive.google.com/drive/folders/0B2Iv8dfU4f

2 Answers

闹比i

2020-12-12 06:22

Short Answer

The value of R^2 is going to be +/- inf as long as y remains constant over the regression window (100 observations in your case). You can find more details below, but the intuition is that R^2 is the proportion of y's variance explained by X: if y's variance is zero, R^2 is simply not well defined.

Possible solution: Try to use a longer window, or resample Y and X so that Y does not remain constant for so many consecutive observations.
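
Before changing the window, it may help to check which windows are actually affected. The snippet below is only a sketch: `constant_windows` is a hypothetical helper name (not part of statsmodels), the toy data is made up, and it assumes `y` is a one-dimensional numeric array without missing values and NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def constant_windows(y, window=100):
    """Return a boolean array that is True where y is constant over a window.

    Entry i describes the window y[i : i + window].
    """
    y = np.asarray(y, dtype=float)
    windows = sliding_window_view(y, window)
    # A window is constant exactly when its max equals its min.
    return windows.max(axis=1) == windows.min(axis=1)


# Toy data (made up for illustration): flat at first, then changing.
y_demo = np.array([1.0, 1.0, 1.0, 1.0, 2.0, 3.0])
flags = constant_windows(y_demo, window=3)
print(flags)
```

Any window flagged `True` has zero variance in y, so those are the windows where R^2 cannot be computed.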

Long Answer

Looking at this I honestly think this is not the right dataset for the regression. This is a simple plot of the data:

Does a linear combination of X and time explain Y? Mmm...doesn't look plausible. Y almost looks like a discrete variable, so you probably want to look at logistic regressions.

To come to your question, R^2 is "the proportion of the variance in the dependent variable that is predictable from the independent variable(s)". From Wikipedia:

In your case it is very likely that Y is constant over 100 observations, so it has zero variance. Computing R^2 then involves a division by zero, hence the inf.
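
To make the division by zero concrete, here is a minimal sketch using the usual definition R^2 = 1 - SS_res / SS_tot, where SS_tot is the sum of squared deviations of y from its mean. The function name `r_squared` and the sample arrays are assumptions for illustration. This is not how statsmodels computes it internally, and the helper returns nan explicitly, whereas code that performs the division directly can end up with inf or nan.

```python
import numpy as np


def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot, or nan when y has zero variance."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares around the mean
    if ss_tot == 0.0:
        # y is constant: there is no variance to explain, so R^2 is undefined.
        return float("nan")
    return 1.0 - ss_res / ss_tot


flat_y = np.full(100, 3.0)  # y constant over a 100-observation window
print(np.sum((flat_y - flat_y.mean()) ** 2))  # SS_tot is zero here
print(r_squared(flat_y, flat_y))
```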

So I am afraid you should not look for fixes in the code, but should rethink the problem and the way you fit the data.

佛祖请我去吃肉

2020-12-12 06:31
Ok so I prepared this small example so you can visualize what a Poisson regression could do.
```
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.discrete.discrete_model import Poisson

poi_model = Poisson

# Simulate 1000 predictor values and Poisson counts with mean 0.5 * x
x = np.random.uniform(0, 20, 1000)
s = np.random.poisson(x * 0.5, 1000)
plt.bar(x, s)
plt.show()
```
This generates random Poisson counts.

Now the way to fit a Poisson regression to the data is the following:
```
my_model = poi_model(endog=s, exog=x)
my_model = my_model.fit()  # my_model now holds the fitted results
print(my_model.summary())
```
The summary displays a number of statistics, but if you want to compute the mean squared error you could do that like so:
```
preds = my_model.predict()
mse = np.mean(np.square(preds - s))
```
If you want to predict new values do the following:
```
my_model.predict(exog=new_value)
```