Interpreting Alias table testing multicollinearity of model in R

巧了我就是萌 submitted on 2019-12-21 06:21:24

Question


Could someone help me interpret the output of the alias function when testing for multicollinearity in a multiple regression model? I know some predictor variables in my model are highly correlated, and I want to identify them using the alias table.

Model :
Score ~ Comments + Pros + Cons + Advice + Response + Value + Recommendation 
+ 6Months + 12Months + 2Years + 3Years + Daily + Weekly + Monthly

Complete :
            (Intercept) Comments Pros Cons Advice Response Value1
UseMonthly1  0           0        0    0    0      0        0
            Recommendation1 6Months1 12Months1 2Years1
UseMonthly1  0               1        1         1
            3Years1 Daily1 Weekly1
UseMonthly1  1       -1      -1

Value, Recommendation, 6Months, 12Months, 2Years, 3Years, Daily, Weekly, and Monthly are binary categorical variables.
Score, Comments, Pros, Cons, Advice, and Response are numeric variables.

Can I assume UseMonthly is highly correlated with 6Months, 12Months, 2Years, 3Years, Daily, Weekly? What is the difference between the 1 and -1 values in the alias output? Is it positive and negative correlation?


Answer 1:


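Nonzero entries in the "complete" matrix show that UseMonthly is an exact linear combination of the terms in those columns. The entries are the coefficients of that combination, not correlations: here UseMonthly1 = 6Months1 + 12Months1 + 2Years1 + 3Years1 - Daily1 - Weekly1, so 1 and -1 indicate whether a term is added or subtracted. Exact linear dependence is the extreme case of multicollinearity; terms can also be highly correlated without being linearly dependent, and alias will not flag those.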
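As a rough illustration of how alias reports an exact linear dependence, here is a minimal sketch on made-up data. The variables x1, x2, x3 and y are hypothetical and are not the variables from the question; x3 is deliberately constructed as x1 - x2.

```r
# Hypothetical toy data (not from the question): x3 is an exact
# linear combination of x1 and x2, so the model is rank deficient.
n  <- 50
x1 <- rep(c(0, 1), length.out = n)
x2 <- rep(c(0, 0, 1, 1, 0), length.out = n)
x3 <- x1 - x2                 # exact linear dependence
y  <- sin(seq_len(n))         # arbitrary response; alias() only depends on the predictors

fit <- lm(y ~ x1 + x2 + x3)

# lm() cannot estimate the aliased term, so its coefficient is NA.
print(coef(fit))

# alias() lists each aliased term as a row; the entries are the
# coefficients that express it in terms of the other columns.
a <- alias(fit)
print(a$Complete)

# Pull the x3 row out as plain numbers; it is expected to show
# 1 under x1 and -1 under x2, matching x3 = x1 - x2.
x3_row <- unclass(a$Complete)["x3", c("x1", "x2")]
print(x3_row)
```

Read this way, a row of the "Complete" table is an equation for the aliased term rather than a list of correlations.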

If your goal is to identify and remove correlated variables, you should remove UseMonthly, but you will probably want to remove others too. A common way to identify variables that are problematic with respect to multicollinearity is to look for large variance inflation factors (calculated by e.g. car::vif).
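To make the variance inflation factor concrete, the sketch below computes it by hand in base R from its usual definition, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The data and the helper name manual_vif are hypothetical; car::vif is the packaged route the answer mentions, and as far as I know it refuses to run on a model that still contains aliased terms, so the aliased variable has to be dropped first.

```r
# Hypothetical toy data (not from the question): x2 is nearly,
# but not exactly, collinear with x1; x3 is roughly unrelated.
n  <- 100
k  <- seq_len(n)
x1 <- sin(k)
x2 <- x1 + 0.1 * cos(3 * k)
x3 <- cos(k / 7)
X  <- data.frame(x1, x2, x3)

# manual_vif is an illustrative helper, not part of any package.
# For each predictor, regress it on all the others and convert the
# resulting R^2 into a variance inflation factor.
manual_vif <- function(X) {
  sapply(names(X), function(v) {
    others <- setdiff(names(X), v)
    r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
    1 / (1 - r2)
  })
}

vifs <- manual_vif(X)
print(round(vifs, 2))
```

A common rule of thumb treats values above roughly 5 or 10 as a sign of problematic collinearity, though the cutoff is a convention rather than a test. In this toy data x1 and x2 should come out far above that range and x3 close to 1.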



Source: https://stackoverflow.com/questions/45328783/interpreting-alias-table-testing-multicollinearity-of-model-in-r
