[Statsmodels]: How can I get statsmodels to return the p-values of an OLS object?

怎甘沉沦 submitted on 2019-12-04 20:12:53

Ran into the same problem.

You can access the p-values through

regressor_OLS.pvalues 

They're stored as an array of float64 values, which NumPy prints in scientific notation when they are very small. I'm a bit new to Python and I'm sure there are cleaner, more elegant solutions, but this was mine:

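import numpy as np
import statsmodels.api as sm

sigLevel = 0.05

# X (2-D NumPy array of predictors) and y (response) are assumed to exist already
X_opt = X[:,[0,1,2,3,4,5]]
regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
regressor_OLS.summary()
pVals = regressor_OLS.pvalues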

# Drop the column with the largest p-value until every p-value is at or below sigLevel
while np.max(pVals) > sigLevel:
    droppedDimIndex = np.argmax(pVals)
    keptDims = list(range(X_opt.shape[1]))
    keptDims.pop(droppedDimIndex)
    print("pval of dim removed: " + str(pVals[droppedDimIndex]))
    X_opt = X_opt[:,keptDims]
    regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
    pVals = regressor_OLS.pvalues
    print(str(len(pVals)-1) + " dimensions remaining...")
    print(pVals)

regressor_OLS.summary()

Thank you, Keith, for your answer. Here are a few small fixes to your loop to make it more efficient:

sigLevel = 0.05
X_opt = X[:,[0,1,2,3,4,5]]
regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
pVals = regressor_OLS.pvalues

while np.max(pVals) > sigLevel:
    worstIndex = np.argmax(pVals)
    # Print the p-value itself (not its index) before the column is removed
    print("pval of dim removed: " + str(pVals[worstIndex]))
    X_opt = np.delete(X_opt, worstIndex, axis = 1)
    print(str(X_opt.shape[1]) + " dimensions remaining...")
    regressor_OLS = sm.OLS(endog = y, exog = X_opt).fit()
    pVals = regressor_OLS.pvalues

regressor_OLS.summary()