glmnet

R vector size limit: “long vectors (argument 5) are not supported in .C”

好久不见. submitted on 2019-12-02 21:03:58
I have a very large matrix that I'm trying to run through glmnet on a server with plenty of memory. It works fine even on very large data sets up to a certain point, after which I get the following error:
Error in elnet(x, ...) : long vectors (argument 5) are not supported in .C
If I understand correctly, this is caused by a limitation in R, which cannot have any vector with a length longer than INT_MAX. Is that correct? Are there any available solutions to this that don't require a complete rewrite of glmnet? Do any of the alternative R interpreters (Riposte, etc.) address this limitation? Thanks!
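As a rough check of the limit described in the question, the sketch below compares the number of cells in a dense matrix with R's largest integer. The dimensions are hypothetical values chosen for illustration, not figures from the question; the error message itself indicates that the `.C` interface rejects vectors longer than this limit.

```r
# R's largest representable integer; a vector with more elements than this
# is a "long vector".
int_max <- .Machine$integer.max

# Hypothetical dimensions, for illustration only (not from the question).
n_rows <- 3e6
n_cols <- 1000

# Multiply as doubles so the product itself cannot overflow.
n_cells <- as.numeric(n_rows) * as.numeric(n_cols)

# TRUE when a dense matrix of this shape would need a long vector.
exceeds_limit <- n_cells > int_max

cat("cells:", format(n_cells, big.mark = ",", scientific = FALSE), "\n")
cat("needs a long vector:", exceeds_limit, "\n")
```

If the data are mostly zeros, a sparse representation may keep the stored values below the limit, though whether that avoids the error depends on the glmnet version in use.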

glmnet: How do I know which factor level of my response is coded as 1 in logistic regression

こ雲淡風輕ζ submitted on 2019-12-01 18:10:41
I have a logistic regression model that I made using the glmnet package. My response variable was coded as a factor, the levels of which I will refer to as "a" and "b". The mathematics of logistic regression labels one of the two classes as "0" and the other as "1". The feature coefficients of a logistic regression model are either positive, negative, or zero. If a feature "f"'s coefficient is positive, then increasing the value of "f" for a test observation x increases the probability that the model classifies x as being of class "1". My question is: Given a glmnet model, how do you know how
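On the question of which level counts as "1": the glmnet documentation for family = "binomial" states that, for a two-level factor response, the last level in alphabetical order is the target class. The base-R sketch below shows how to inspect and change the level order; it assumes glmnet follows the stored level order of the factor, and the toy response is made up for illustration.

```r
# Toy response with the two levels named in the question.
y <- factor(c("a", "b", "b", "a", "b"))

# Level order decides the coding; by default it is alphabetical.
print(levels(y))

# Under the documented convention, the second (last) level is the
# class whose probability the model describes, i.e. class "1".
target <- levels(y)[2]
cat("class treated as 1:", target, "\n")

# To make "a" the target instead, move "b" to the front before fitting.
y_flipped <- relevel(y, ref = "b")
print(levels(y_flipped))
```

With this convention, a positive coefficient raises the predicted probability of the second level.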

Extract the coefficients for the best tuning parameters of a glmnet model in caret

假装没事ソ submitted on 2019-12-01 06:01:11
I am running elastic net regularization in caret using glmnet. I pass a sequence of values to trainControl for alpha and lambda, then I perform repeatedcv to get the optimal tunings of alpha and lambda. Here is an example where the optimal tunings for alpha and lambda are 0.7 and 0.5 respectively:
age <- c(4, 8, 7, 12, 6, 9, 10, 14, 7, 6, 8, 11, 11, 6, 2, 10, 14, 7, 12, 6, 9, 10, 14, 7)
gender <- make.names(as.factor(c(1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1)))
bmi_p <- c(0.86, 0.45, 0.99, 0.84, 0.85, 0.67, 0.91, 0.29, 0.88, 0.83, 0.48, 0.99, 0.80, 0.85, 0.50, 0
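One common way to get the coefficients at the selected tuning values is to call coef() on the fitted finalModel with the chosen lambda. The sketch below is not the asker's code: the toy data, the grid values, and the object names are assumptions for illustration, and it needs both caret and glmnet installed. Note that in caret the alpha/lambda grid is supplied through the tuneGrid argument of train().

```r
library(caret)  # train() with method = "glmnet" also needs glmnet installed

set.seed(1)

# Toy regression data (hypothetical, not the data from the question).
n <- 60
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 2 * toy$x1 - toy$x2 + rnorm(n)

# Candidate tuning values and the resampling scheme.
grid <- expand.grid(alpha = c(0.3, 0.7), lambda = c(0.1, 0.5))
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2)

fit <- train(y ~ ., data = toy, method = "glmnet",
             tuneGrid = grid, trControl = ctrl)

# The selected alpha and lambda.
print(fit$bestTune)

# finalModel is a glmnet fit at the selected alpha; s picks the lambda.
best_coefs <- coef(fit$finalModel, s = fit$bestTune$lambda)
print(best_coefs)
```

The result is a sparse one-column matrix with the intercept in the first row; as.matrix(best_coefs) turns it into an ordinary matrix.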

Error - Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs)= etc

江枫思渺然 submitted on 2019-12-01 04:24:27
Question: Getting an error when using glmnet in caret. Example below.
# Load libraries
library(dplyr)
library(caret)
library(C50)
# Load churn data set from library C50
data(churn)
# Create x and y variables
churn_x <- subset(churnTest, select= -churn)
churn_y <- churnTest[[20]]
# Use createFolds() to create 5 CV folds on churn_y, the target variable
myFolds <- createFolds(churn_y, k = 5)
# Create trainControl object: myControl
myControl <- trainControl( summaryFunction = twoClassSummary, classProbs = TRUE, #

How does glmnet's standardize argument handle dummy variables?

帅比萌擦擦* submitted on 2019-11-30 11:06:31
Question: In my dataset I have a number of continuous and dummy variables. For analysis with glmnet, I want the continuous variables to be standardized but not the dummy variables. I currently do this manually by first defining a dummy vector of columns that have only values of [0,1] and then using the scale command on all the non-dummy columns. Problem is, this isn't very elegant. But glmnet has a built-in standardize argument. By default, will this standardize the dummies too? If so, is there an

Can't run glmnet() R package : “ could not find function ”lengths“ ”

可紊 submitted on 2019-11-30 09:42:56
Question: I'm using the glmnet R package, and before today I had no problems using it. I installed caret two days ago. I had some trouble installing it, but I eventually succeeded by re-installing some packages. Here is the error message I get:
Error in .fixupDimnames(.Object@Dimnames) : could not find function "lengths"
I'm using an old version of R that I can't update right now.
sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale: [1] LC_COLLATE=French

executing glmnet in parallel in R

岁酱吖の submitted on 2019-11-30 03:51:11
My training dataset has about 200,000 records and 500 features. (These are sales data from a retail org.) Most of the features are 0/1 and are stored as a sparse matrix. The goal is to predict the probability of purchase for about 200 products, so I need to use the same 500 features for each of the 200 products. Since glmnet is a natural choice for model creation, I thought about running glmnet in parallel for the 200 products, since all 200 models are independent. But I am stuck using foreach. The code I executed was:
foreach(i = 1:ncol(target))
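A minimal sketch of the one-model-per-column idea with foreach and doParallel follows. It is not the asker's code: the toy data, the cluster size of 2, and the object names are assumptions for illustration, and it needs glmnet, foreach, and doParallel installed.

```r
library(glmnet)
library(foreach)
library(doParallel)

set.seed(1)

# Toy stand-ins for the data in the question (hypothetical sizes).
n <- 200
p <- 20
n_products <- 4
x <- Matrix::Matrix(matrix(rbinom(n * p, 1, 0.3), n, p), sparse = TRUE)
target <- matrix(rbinom(n * n_products, 1, 0.4), n, n_products)

# Start a small local cluster and register it as the foreach backend.
cl <- makeCluster(2)
registerDoParallel(cl)

# Fit one independent logistic model per target column.
# .packages loads glmnet on each worker; the result is a list of fits.
models <- foreach(i = seq_len(ncol(target)), .packages = "glmnet") %dopar% {
  glmnet(x, target[, i], family = "binomial")
}

stopCluster(cl)

cat("fitted models:", length(models), "\n")
```

Each worker receives its own copy of x, so with a very large matrix the memory cost grows with the number of workers.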

How does glmnet's standardize argument handle dummy variables?

左心房为你撑大大i submitted on 2019-11-29 23:14:22
In my dataset I have a number of continuous and dummy variables. For analysis with glmnet, I want the continuous variables to be standardized but not the dummy variables. I currently do this manually by first defining a dummy vector of columns that have only values of [0,1] and then using the scale command on all the non-dummy columns. Problem is, this isn't very elegant. But glmnet has a built-in standardize argument. By default, will this standardize the dummies too? If so, is there an elegant way to tell glmnet's standardize argument to skip dummies?
In short, yes - this will standardize the
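The manual approach described in the question can be sketched in base R as below: detect the 0/1 columns, scale only the remaining ones, and then fit with standardize = FALSE. The toy matrix and column names are made up for illustration, and the glmnet call is left as a comment so the block runs without the package.

```r
set.seed(1)

# Toy design matrix: two continuous columns and two 0/1 dummies (hypothetical).
x <- cbind(
  age    = rnorm(10, mean = 40, sd = 10),
  income = rnorm(10, mean = 50, sd = 15),
  female = rbinom(10, 1, 0.5),
  smoker = rbinom(10, 1, 0.5)
)

# A column counts as a dummy if every value is 0 or 1.
is_dummy <- apply(x, 2, function(col) all(col %in% c(0, 1)))

# Center and scale only the non-dummy columns; dummies stay untouched.
x_scaled <- x
x_scaled[, !is_dummy] <- scale(x[, !is_dummy])

print(round(colMeans(x_scaled), 3))

# The scaled matrix would then be passed on with standardization turned off:
# fit <- glmnet::glmnet(x_scaled, y, standardize = FALSE)
```

Leaving the dummies unscaled means they are penalized on their original 0/1 scale, which changes how strongly they are shrunk relative to the scaled columns.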

Why is it inadvisable to get statistical summary information for regression coefficients from glmnet model?

橙三吉。 submitted on 2019-11-29 20:52:04
I have a regression model with a binary outcome. I fitted the model with glmnet and got the selected variables and their coefficients. Since glmnet doesn't calculate variable importance, I would like to feed the exact output (selected variables and their coefficients) to glm to get the information (standard errors, etc.). I searched the R documentation, and it seems I can use the "method" option in glm to specify a user-defined function. But I failed to do so. Could someone help me with this?
"It is a very natural question to ask for standard errors of regression coefficients or other estimated quantities. In

Can't run glmnet() R package : “ could not find function ”lengths“ ”

怎甘沉沦 submitted on 2019-11-29 16:17:06
I'm using the glmnet R package, and before today I had no problems using it. I installed caret two days ago. I had some trouble installing it, but I eventually succeeded by re-installing some packages. Here is the error message I get:
Error in .fixupDimnames(.Object@Dimnames) : could not find function "lengths"
I'm using an old version of R that I can't update right now.
sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale: [1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 [4] LC_NUMERIC=C LC_TIME=French
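For context, lengths() was added to base R in version 3.2.0, so it does not exist in R 3.0.2. The sketch below checks for it and shows an equivalent for plain lists. The helper name lengths_compat is hypothetical, and defining it is only an illustration of what the function does, not a confirmed fix for the error above.

```r
# Does this R installation provide base::lengths()?
has_lengths <- exists("lengths", envir = baseenv(), inherits = FALSE)
r_new_enough <- getRversion() >= "3.2.0"

cat("R version:", as.character(getRversion()), "\n")
cat("base::lengths available:", has_lengths, "\n")

# Hypothetical stand-in with the same result for ordinary lists:
# the length of each element, returned as a named integer vector.
lengths_compat <- function(x) vapply(x, length, integer(1))

print(lengths_compat(list(a = 1:3, b = letters)))
```

When R itself cannot be updated, a likely route is installing package versions that were released for the older R, since newer builds may call functions that R 3.0.2 does not have.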