Capturing high multi-collinearity in statsmodels
Say I fit a model in statsmodels:

    mod = smf.ols('dependent ~ first_category + second_category + other', data=df).fit()

When I do mod.summary() I may see the following:

    Warnings:
    [1] The condition number is large, 1.59e+05. This might indicate that there are
    strong multicollinearity or other numerical problems.

Sometimes the warning is different (e.g. based on eigenvalues of the design matrix). How can I capture high-multi-collinearity conditions in a variable? Is this warning stored somewhere in the model object?

Also, where can I find a description of the fields in summary()?

You can detect
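As a minimal sketch of capturing the condition in a variable: the fitted results object exposes `condition_number` and `eigenvals` attributes, and these can be compared against thresholds directly. The synthetic data, the column names `x1`, `x2`, `y`, and the cutoffs `1000` and `1e-10` are assumptions for illustration; the cutoffs are meant to mirror the ones `summary()` appears to use for its warnings, and should be checked against the installed statsmodels version.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (illustrative only): x2 is almost an exact copy of x1,
# so the design matrix is nearly collinear.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
df = pd.DataFrame({
    "x1": x1,
    "x2": x1 + rng.normal(scale=1e-4, size=n),
})
df["y"] = 1.0 + 2.0 * df["x1"] + rng.normal(size=n)

mod = smf.ols("y ~ x1 + x2", data=df).fit()

# Condition number of the design matrix, as reported in summary().
cond = mod.condition_number

# Eigenvalues of X'X, sorted in decreasing order; the last is the smallest.
smallest_eig = mod.eigenvals[-1]

# Assumed thresholds, chosen to match the summary() warnings.
COND_THRESHOLD = 1000
EIG_THRESHOLD = 1e-10

high_collinearity = bool(cond > COND_THRESHOLD or smallest_eig < EIG_THRESHOLD)
print(f"condition number: {cond:.3g}, smallest eigenvalue: {smallest_eig:.3g}")
print("high multicollinearity:", high_collinearity)
```

Testing the numbers rather than parsing the warning text from `summary()` avoids depending on the exact wording of the message, which may change between releases.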