Stepwise Regression in Python

青春惊慌失措 2020-12-24 14:26

How can I perform stepwise regression in Python? There are methods for OLS in SciPy, but I have not been able to do stepwise selection with them. Any help in this regard would be appreciated.

7 answers
  •  谎友^ (original poster)
    2020-12-24 14:53

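    # Import the statsmodels API. The OLS class is exposed by statsmodels.api;
    # statsmodels.formula.api provides the formula-based ols() instead.
    import statsmodels.api as sm

    # X_opt holds all the independent-variable columns of the matrix X;
    # in this case there are 5 independent variables.
    # Note: sm.OLS does not add an intercept automatically, so X must already
    # include a constant column if one is wanted.
    X_opt = X[:, [0, 1, 2, 3, 4]]

    # Fit OLS on X_opt and store the results in regressor_OLS
    regressor_OLS = sm.OLS(endog=y, exog=X_opt).fit()
    print(regressor_OLS.summary())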
    

    Using the summary method, you can check the p-values of your variables in your kernel's output, where they are listed under 'P>|t|'. Find the variable with the highest p-value. Suppose x3 has the highest value, e.g. 0.956. Remove that column from your array and repeat all the steps.

    # Drop the column at index 2 (the one with the highest p-value) and refit
    X_opt = X[:, [0, 1, 3, 4]]
    regressor_OLS = sm.OLS(endog=y, exog=X_opt).fit()
    print(regressor_OLS.summary())
    

    Repeat these steps, removing one column at a time, until no remaining column has a p-value higher than the significance level (e.g. 0.05). In the end, X_opt will contain only the variables whose p-values are below the significance level. This procedure is the backward-elimination form of stepwise regression.
