Screening (multi)collinearity in a regression model

南方客 2020-12-12 09:49

I hope that this one is not going to be an "ask-and-answer" question... here goes: (multi)collinearity refers to extremely high correlations between predictors in the regress

5 answers
  •  没有蜡笔的小新
    2020-12-12 10:17

    See also Section 9.4 of the book Practical Regression and Anova using R [Faraway 2002].

    Collinearity can be detected in several ways:

    1. Examination of the correlation matrix of the predictors will reveal large pairwise collinearities.

    2. Regress x_i on all the other predictors to obtain R^2_i, and repeat for every predictor. An R^2_i close to one indicates a problem; the offending linear combination may then be found from the coefficients of that regression.

    3. Examine the eigenvalues of t(X) %*% X, where X denotes the model matrix; small eigenvalues indicate a problem. The 2-norm condition number of X can be shown to be the ratio of its largest to its smallest non-zero singular value, i.e. $\kappa = \sqrt{\lambda_1/\lambda_p}$, where $\lambda_1 \geq \dots \geq \lambda_p$ are the eigenvalues of t(X) %*% X (see ?kappa); $\kappa \geq 30$ is considered large.
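
    The three checks above can be sketched in base R roughly as follows. This is only an illustration on simulated data: the variables x1, x2, x3 and y are hypothetical, and x3 is deliberately built as a near-exact linear combination of x1 and x2 so that the diagnostics have something to find.

```r
# Hypothetical simulated data: x3 is almost exactly x1 + x2.
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.01)
y  <- 1 + x1 - x2 + rnorm(n)
dat   <- data.frame(y, x1, x2, x3)
preds <- dat[, c("x1", "x2", "x3")]

# 1. Correlation matrix of the predictors (reveals pairwise collinearity only).
print(round(cor(preds), 2))

# 2. R^2_i from regressing each predictor on all the others.
r2 <- sapply(names(preds), function(v) {
  others <- setdiff(names(preds), v)
  fit <- lm(reformulate(others, response = v), data = preds)
  summary(fit)$r.squared
})
print(r2)

# 3. Eigenvalues of t(X) %*% X and the resulting condition number.
X    <- model.matrix(y ~ x1 + x2 + x3, data = dat)
ev   <- eigen(crossprod(X), symmetric = TRUE)$values
cond <- sqrt(max(ev) / min(ev))
print(ev)
print(cond)

# Cross-check against the exact 2-norm condition number of X.
print(kappa(X, exact = TRUE))
```

    In this setup the pairwise correlations may look only moderate even though every R^2_i is close to one, which is why checks 2 and 3 are worth running in addition to check 1. Note that kappa() returns a cheaper approximation unless exact = TRUE is given.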
