“Error in drop(y %*% rep(1, nc))” error for cv.glmnet in glmnet R package

Submitted by 岁酱吖の on 2019-12-12 14:03:58

Question


I have a function that returns the AUC value for a cv.glmnet model. Often, though not most of the time, the cv.glmnet call inside it fails with the following error:

Error in drop(y %*% rep(1, nc)) : error in evaluating the argument 'x' in selecting a method for function 'drop': Error in y %*% rep(1, nc) : non-conformable arguments

I've read a little bit about the error and the only suggestion I could find was to use data.matrix() instead of as.matrix(). My function is as follows (where "form" is a formula with my desired variables and "dt" is the data frame):

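library(glmnet)  # cv.glmnet
library(ROCR)    # prediction, performance

auc_cvnet <- function(form, dt, standard = FALSE) {
  vars <- all.vars(form)
  depM <- dt[[vars[1]]]              # response: first variable in the formula
  indM <- data.matrix(dt[vars[-1]])  # predictors as a numeric matrix
  model <- cv.glmnet(indM, depM, family = "binomial", nfolds = 3,
                     type.measure = "auc", standardize = standard)

  # AUC of the fitted model on the same data it was trained on
  pred <- predict(model, indM, type = "response")
  tmp <- prediction(pred, depM)
  auc.tmp <- performance(tmp, "auc")
  return(as.numeric(auc.tmp@y.values))
}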

I'm implementing this function in another function that iterates through combinations of a few variables to see what combinations of variables work well (it's a pretty brute-force method). Anyway, I printed out the formula for the iteration when the error was thrown and called the function with just that formula and it worked fine. So unfortunately I can't pinpoint what calls throw an error, otherwise I'd try to give more information. The data frame has about 30 rows and there are no errors when I run my code on a larger data set with 110 rows. There also are no NAs in either data set.

Has anyone seen this before or have any thoughts? Thanks!


Answer 1:


Believe it or not, I got this same error today. I don't know your dataset, so I can't say for sure what the cause is, but in my case the data I was passing as the y variable (your depM) was a column of all TRUE values. cv.glmnet only returned a valid model when my y variable contained both TRUE and FALSE values.

I wish I could explain why cv.glmnet required both True and False, but I have a lack of understanding of the function itself (as it is, I am only adapting code given to me). I just thought I'd post this in case it would give you some help troubleshooting. Good luck!
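A quick way to test for this situation before fitting is to tabulate the response. The snippet below is only a sketch: has_both_classes is a made-up helper name, not a glmnet function, and the threshold of two observations per class is an assumption rather than a documented requirement.

```r
# Hypothetical helper (not part of glmnet): TRUE only if the response has
# exactly two classes and each has at least `min_per_class` observations.
has_both_classes <- function(y, min_per_class = 2) {
  counts <- table(y)
  length(counts) == 2 && all(counts >= min_per_class)
}

has_both_classes(c(TRUE, TRUE, TRUE))          # only one class present
has_both_classes(c(TRUE, FALSE, TRUE, FALSE))  # both classes present
```

In the question's function this check could be run on depM before the cv.glmnet call, skipping or flagging any variable combination whose response fails it.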




Answer 2:


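I have the same problem when running cv.glmnet on a dataset with 2 positive cases and 850 negative ones. In one of the cross-validation iterations (where the training and testing sets are randomly sampled), both positive cases are left out of the training set, so that training set contains only one class. glmnet then calls lognet, which in turn calls drop(y %*% rep(1, nc)), but at that point y is a vector and not a matrix with at least two columns, hence the "non-conformable arguments" error. Because the folds are drawn at random, the same call can fail on one run and succeed on the next.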

The easiest fix I can think of is to pass the foldid argument to cv.glmnet yourself and make sure that both classes are present in the training data of every iteration.
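One way to build such a foldid vector is to assign folds separately within each class, so the classes are spread across folds instead of being left to chance. This is a minimal sketch in base R: stratified_foldid is a made-up helper name, the example response is invented, and the commented cv.glmnet call assumes the question's indM and depM objects.

```r
# Hypothetical helper (not part of glmnet): assign fold ids within each
# class so that each class is spread over as many folds as it has rows for.
stratified_foldid <- function(y, nfolds = 3) {
  y <- as.factor(y)
  foldid <- integer(length(y))
  for (cls in levels(y)) {
    idx <- which(y == cls)
    # Shuffle this class's rows, then deal them out 1, 2, ..., nfolds, 1, ...
    idx <- idx[sample.int(length(idx))]
    foldid[idx] <- rep_len(seq_len(nfolds), length(idx))
  }
  foldid
}

set.seed(1)
y <- c(rep(0, 24), rep(1, 6))  # invented response with 30 rows
foldid <- stratified_foldid(y, nfolds = 3)
table(y, foldid)               # every class appears in every fold

# Assumed usage with the question's objects:
# model <- cv.glmnet(indM, depM, family = "binomial", foldid = foldid,
#                    type.measure = "auc", standardize = FALSE)
```

When foldid is supplied, nfolds does not need to be passed. Stratifying only guarantees that both classes appear in each training set when each class has at least two observations. With extremely rare classes, such as the 2 positives out of 852 above, a training set can still end up with a single positive case, which glmnet may also reject.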



Source: https://stackoverflow.com/questions/24811598/error-in-dropy-rep1-nc-error-for-cv-glmnet-in-glmnet-r-package
