Remove highly correlated variables

遥遥无期 2021-01-30 05:21

I have a huge data frame (5600 x 6592) and I want to remove any variables whose correlation with each other exceeds 0.99. I know how to do this the long way, step by step, i.e.

3 Answers
  •  天涯浪人
    2021-01-30 05:53

    @David A small change to your code makes it more robust to negative correlations: use

    abs(x) > 0.99 
    

    instead of just

    x > 0.99
    

    data.new <- data[,!apply(tmp,2,function(x) any(abs(x) > 0.99))]
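    For anyone trying this out, here is a minimal self-contained sketch on toy data. David's code is not shown in this thread, so the way `tmp` is built here (the correlation matrix with its upper triangle and diagonal zeroed, so each pair is counted once) is an assumption. The toy columns `x1` to `x4` and the name `too_high` are made up for illustration.

    ```r
    # Toy data (hypothetical): x2 is almost a copy of x1, x3 is almost -x1,
    # and x4 is unrelated noise.
    set.seed(1)
    x1 <- rnorm(100)
    data <- data.frame(
      x1 = x1,
      x2 = x1 + rnorm(100, sd = 0.001),
      x3 = -x1 + rnorm(100, sd = 0.001),
      x4 = rnorm(100)
    )

    # Assumed construction of tmp: pairwise correlations, keeping only the
    # lower triangle so that each pair of variables appears exactly once.
    tmp <- cor(data)
    tmp[upper.tri(tmp)] <- 0
    diag(tmp) <- 0

    # Flag a column if it is correlated above 0.99 in absolute value with
    # any later column; abs() is what catches the negative case (x3).
    too_high <- apply(tmp, 2, function(x) any(abs(x) > 0.99))

    # drop = FALSE keeps a data frame even if only one column survives.
    data.new <- data[, !too_high, drop = FALSE]
    names(data.new)
    ```

    With only `x > 0.99`, the strongly negative pair `x2`/`x3` would both survive in this toy example, since their correlation is close to -1 rather than +1.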

    cheers..!!!
