Ridge regression in glmnet in R; Calculating VIF for different lambda values using glmnet package

Posted by 匆匆过客 on 2020-01-24 13:04:20

Question


I have a set of multicollinear variables and I'm trying to use ridge regression to deal with that. I am using the glmnet package in R with alpha = 0 (ridge regression).

library(glmnet)

I have a sequence of lambda values, and I am choosing the best one through cv.glmnet:

lambda <- 10^seq(10, -2, length = 100)

Creating the model matrix and assigning the y variable:

x <- model.matrix(dv ~ ., datamatrix) [,-1]
y <- datamatrix$dv

Using cross-validation to determine the best lambda and predicting y with that lambda value:

ridge.mod <- glmnet(x, y, alpha = 0, lambda = lambda)
cv.out <- cv.glmnet(x, y, alpha = 0)
ridge.pred <- predict(ridge.mod, s = cv.out$lambda.min, newx = x)

Everything works up to this point, but I also have to check the VIF at this particular lambda value to make sure that the coefficients have stabilized and the multicollinearity is under control. I am not sure how to check the VIF for a glmnet model, since the usual vif() function throws this error:

Error in vcov.default(mod) : there is no vcov() method for models of class elnet, glmnet

Could you please help me identify if there is anything wrong in my approach or how to solve this issue?

Is VIF not applicable for validation in GLMNET?

Thanks in advance.


Answer 1:


VIF is a property of the set of independent variables only. It doesn't matter what the dependent variable is or what kind of model you use (linear regression, generalized linear model), as long as the model doesn't change the independent variables (as, for example, an additive model does). See the vif function from the car package. So VIF applied to an elastic net regression won't tell you whether you have dealt with multicollinearity. It can only tell you that there was multicollinearity to deal with.
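As a minimal sketch of this point in base R: for numeric predictors, the VIFs can be computed from the predictor matrix alone, either as the diagonal of the inverse correlation matrix or as 1 / (1 - R^2) from regressing each predictor on the others. No response variable appears anywhere. The simulated data and the names x1, x2, x3 are made up for illustration.

```r
# Simulated predictors (hypothetical data): x2 is nearly a copy of x1
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
x  <- cbind(x1 = x1, x2 = x2, x3 = x3)

# VIFs as the diagonal of the inverse correlation matrix of the predictors
vif_from_cor <- diag(solve(cor(x)))

# Cross-check: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from
# regressing predictor j on all the other predictors
vif_from_r2 <- sapply(seq_len(ncol(x)), function(j) {
  r2 <- summary(lm(x[, j] ~ x[, -j]))$r.squared
  1 / (1 - r2)
})

print(round(vif_from_cor, 2))
print(round(vif_from_r2, 2))
```

Both computations use only x, so the values are the same whichever model is later fitted to y.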




Answer 2:


Chatterjee and Hadi's Regression Analysis by Example (p. 295) gives a ridge regression definition of the VIF, in which Z is the standardized version of the covariate matrix.
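The formula itself is not reproduced in this answer. A commonly cited ridge VIF, assumed here to match the book's definition, takes VIF_j(k) as the j-th diagonal element of (Z'Z + kI)^-1 Z'Z (Z'Z + kI)^-1, where Z'Z is the correlation matrix of the predictors and k is the ridge constant. The sketch below implements that form in base R; the helper name ridge_vif and the simulated data are hypothetical. Note that k is the ridge constant on the correlation scale, which is not necessarily the same number as the lambda reported by glmnet, since glmnet scales its loss differently.

```r
# Ridge VIFs under the assumed formula:
# diagonal of (R + kI)^-1 R (R + kI)^-1, with R = cor(x) standing in for Z'Z
ridge_vif <- function(x, k) {
  R <- cor(x)
  A <- solve(R + k * diag(ncol(R)))
  v <- diag(A %*% R %*% A)
  names(v) <- colnames(x)
  v
}

# Simulated predictors (hypothetical data): x2 is nearly a copy of x1
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
x  <- cbind(x1 = x1, x2 = x2, x3 = x3)

# VIFs over a small grid of ridge constants; k = 0 gives the ordinary VIFs
k_grid   <- c(0, 0.01, 0.1, 1)
vif_by_k <- sapply(k_grid, function(k) ridge_vif(x, k))
colnames(vif_by_k) <- paste0("k=", k_grid)
print(round(vif_by_k, 2))
```

Scanning the columns of vif_by_k shows how quickly the VIFs fall as k grows, which is one way to judge whether a given amount of shrinkage has brought the collinearity under control.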




Answer 3:


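The function car::vif does not work on objects returned by glmnet; it expects an unpenalized fit such as one from lm or glm. You could extract the names of the variables with nonzero coefficients from the glmnet fit, refit an unpenalized model with those variables, and then run vif on the new fit.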

This code should work.

library(car)
library(glmnet)

# alpha defaults to 1 (lasso), so some coefficients are shrunk exactly to zero
cvfit <- cv.glmnet(train.x, train.y,
                   family = "binomial",
                   type.measure = "class",
                   nlambda = 1000)

tmp_coeffs <- coef(cvfit, s = "lambda.min")
# names of the nonzero coefficients, without the intercept
columns <- rownames(tmp_coeffs)[tmp_coeffs@i + 1]
columns <- setdiff(columns, "(Intercept)")

# create formula from fit
# (assumes the column names of train.x are also columns of train)
logistic_reduced <- as.formula(paste("outcome ~",
                                     paste(columns, collapse = " + ")))

# refit logistic regression; glm is needed here, lm ignores the family argument
new.fit <- glm(logistic_reduced,
               family = binomial(link = "logit"),
               data = train)

# get vif (requires at least two predictors in the refitted model)
vif(new.fit)



Source: https://stackoverflow.com/questions/44862009/ridge-regression-in-glmnet-in-r-calculating-vif-for-different-lambda-values-usi
