Question
I want to use the statsmodels.regression.linear_model.OLS class to do a prediction, but with a specified constant.
Currently, I can specify the presence of a constant with an argument:
(from docs: http://statsmodels.sourceforge.net/devel/generated/statsmodels.regression.linear_model.OLS.html)
class statsmodels.regression.linear_model.OLS(endog, exog=None, missing='none', hasconst=None), where hasconst is a boolean.
What I want to do is specify a constant C explicitly, and then fit a linear regression model around it. From that OLS I want to generate a fitted model and then access all its attributes (resid, etc.).
A current, suboptimal workaround is to specify the OLS without a constant, subtract the constant from the y-values, and create a custom object that wraps both the specified constant and the constant-free OLS, so that every call to fit or predict first subtracts the constant from the y-values (or adds it back to predictions).
Thanks!
Answer 1:
If you use the formula
API for statsmodels, you can specify a constant intercept more concisely as part of a Patsy design matrix specification. This is still a bit hacky--it's basically just a cleaner way of expressing your proposed solution--but at least it's shorter. E.g.:
>>> import statsmodels.formula.api as smf
>>> import pandas as pd
>>> import numpy as np
>>> c = 3.1416
>>> df = pd.DataFrame(np.random.rand(10, 2), columns=['x', 'y'])
>>> # I() forces arithmetic; a bare "y - c" would be parsed by Patsy as term removal
>>> ols = smf.ols('I(y - c) ~ 0 + x', data=df)
>>> result = ols.fit()
>>> print(result.summary())
...
==============================================================================
coef std err t P>|t| [95.0% Conf. Int.]
------------------------------------------------------------------------------
x 0.7404 0.230 3.220 0.010 0.220 1.261
==============================================================================
As you can see, no intercept is estimated; the only fitted coefficient is the slope on x, computed against the shifted response y - c.
Source: https://stackoverflow.com/questions/26534800/specifying-a-constant-in-statsmodels-linear-regression