Variance Inflation Factor in Python

星月不相逢 2020-12-22 23:04

I'm trying to calculate the variance inflation factor (VIF) for each column in a simple dataset in Python:

a b c d
1 2 4 4
1 2 6 3
2 3 7 4
3 2 8 5
4 1 9 4
         


        
8 Answers
  •  悲&欢浪女
    2020-12-22 23:30

    Example using the Boston housing data:

    VIF is calculated from auxiliary regressions (each regressor regressed on the others), so it does not depend on the actual fit of y on X.
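
For reference, the standard textbook definition behind the "auxiliary regression" remark can be written as follows. The symbols are not from the original answer: here R_i^2 denotes the R-squared obtained when regressor X_i is regressed on all the other regressors (with an intercept).

```latex
\mathrm{VIF}_i = \frac{1}{1 - R_i^2}
```

The dependent variable y does not appear anywhere in this expression, which is why the VIF is independent of the final fit.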

    See below:

    from patsy import dmatrices
    from statsmodels.stats.outliers_influence import variance_inflation_factor
    import statsmodels.api as sm

    # `boston` is assumed to be a pandas DataFrame holding the Boston housing data.
    # Break into left- and right-hand sides (y and X); patsy adds an Intercept column to X.
    y, X = dmatrices("medv ~ crim + zn + nox + ptratio + black + rm", data=boston, return_type="dataframe")

    # For each column of X (including the intercept), calculate the VIF
    vif = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]

    # Fit y on X; the VIFs above do not depend on this fit
    result = sm.OLS(y, X).fit()
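
To make the auxiliary-regression idea concrete, here is a minimal sketch that computes the VIF by hand for the small a/b/c/d dataset from the question, using only NumPy. The helper name `vif_by_auxiliary_regression` is hypothetical (it is not part of statsmodels), and the sketch assumes every auxiliary regression should include an intercept, which is what the patsy design matrix above provides.

```python
import numpy as np

# Dataset from the question (columns a, b, c, d).
names = ["a", "b", "c", "d"]
data = np.array([
    [1, 2, 4, 4],
    [1, 2, 6, 3],
    [2, 3, 7, 4],
    [3, 2, 8, 5],
    [4, 1, 9, 4],
], dtype=float)


def vif_by_auxiliary_regression(X, i):
    """Hypothetical helper: VIF of column i via an auxiliary OLS regression."""
    target = X[:, i]
    others = np.delete(X, i, axis=1)
    # Assumption: the auxiliary regression includes an intercept term.
    design = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    ss_res = residuals @ residuals
    ss_tot = ((target - target.mean()) ** 2).sum()
    r_squared = 1.0 - ss_res / ss_tot
    return 1.0 / (1.0 - r_squared)


vifs = {name: vif_by_auxiliary_regression(data, i) for i, name in enumerate(names)}
for name, value in vifs.items():
    print(f"{name}: {value:.2f}")
```

Note that statsmodels' `variance_inflation_factor` does not add a constant on its own, so passing it the raw columns without an intercept would generally give different numbers from this sketch.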
    
