glmnet

Lasso error in glmnet NA/NaN/Inf

Submitted by 北战南征 on 2019-12-08 15:55:19
Question: I'm having an issue with glmnet in that I keep getting the error message "Error in elnet(x, is.sparse, ix, jx, y, weights, offset, type.gaussian, : NA/NaN/Inf in foreign function call (arg 5) In addition: Warning message: In elnet(x, is.sparse, ix, jx, y, weights, offset, type.gaussian, : NAs introduced by coercion". Below I can replicate the error with the 'iris' data set, but here is the simplified code for my particular data: vars <- as.matrix(ind.vars) lasso <- glmnet(vars, y=cup98$TARGET
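This error usually means the design matrix is not fully numeric: `as.matrix()` on a data frame containing any factor or character column coerces the whole matrix to character, and glmnet then produces NAs. A minimal sketch of a safer conversion using `model.matrix()`, shown on `iris` (the dataset the question mentions; `ind.vars` and `cup98` are not available here):

```r
library(glmnet)

# as.matrix() on a mixed-type data frame yields a character matrix;
# model.matrix() instead builds a fully numeric design matrix,
# dummy-coding factor columns such as Species.
x <- model.matrix(Sepal.Length ~ . - 1, data = iris)
y <- iris$Sepal.Length

stopifnot(is.numeric(x), !anyNA(x))  # no NA/NaN/Inf left for elnet
fit <- glmnet(x, y)                  # fits without the coercion error
```

The same check (`is.numeric()` plus `anyNA()`) on `vars` before calling glmnet will show whether the coercion is the culprit.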

Extract data from glmnet output data

Submitted by 一曲冷凌霜 on 2019-12-08 13:29:18
Question: I am trying to do feature selection using the glmnet package. I have been able to run glmnet. However, I have a tough time understanding the output. My goal is to get the list of genes and their respective coefficients so I can rank the genes by how relevant they are at separating my two groups of labels. x = manual_normalized_melt[,colnames(manual_normalized_melt) %in% sig_0_01_ROTS$Gene] y = cellID_reference$conditions glmnet_l0 <- glmnet(x = as.matrix(x), y = y, family =
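`coef()` on a fitted glmnet or cv.glmnet object returns a sparse one-column matrix of coefficients at a chosen lambda; ranking genes then amounts to sorting its nonzero entries by absolute value. A sketch with simulated data standing in for the expression matrix (the gene names and labels here are made up for illustration):

```r
library(glmnet)

set.seed(1)
x <- matrix(rnorm(100 * 20), nrow = 100,
            dimnames = list(NULL, paste0("gene", 1:20)))
y <- factor(rep(c("A", "B"), each = 50))

cvfit <- cv.glmnet(x, y, family = "binomial")
beta  <- coef(cvfit, s = "lambda.min")        # sparse (p+1) x 1 matrix
beta  <- as.matrix(beta)[-1, , drop = FALSE]  # drop the intercept row
nz    <- beta[beta[, 1] != 0, , drop = FALSE] # keep only selected genes

# genes ranked by the magnitude of their coefficient
nz[order(abs(nz[, 1]), decreasing = TRUE), , drop = FALSE]
```

Genes with a zero coefficient were removed by the lasso penalty; ranking by |coefficient| is only meaningful if the columns of x were standardized (glmnet does this by default).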

Using glmnet to predict a continuous variable in a dataset

Submitted by 隐身守侯 on 2019-12-08 13:24:30
Question: I have this data set: wbh. I wanted to use the R package glmnet to determine which predictors would be useful in predicting fertility. However, I have been unable to do so, most likely because I do not fully understand the package. The fertility variable is SP.DYN.TFRT.IN. I want to see which predictors in the data set give the most predictive power for fertility. I wanted to use LASSO or ridge regression to shrink the number of coefficients, and I know this package can do that. I'm
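For a continuous response like a fertility rate, `cv.glmnet()` with `family = "gaussian"` selects lambda by cross-validation, and `alpha` switches between ridge (0) and lasso (1). A sketch on simulated data (the column name SP.DYN.TFRT.IN comes from the question; everything else here is illustrative):

```r
library(glmnet)

set.seed(42)
n <- 200; p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- 2 * x[, 1] - x[, 3] + rnorm(n)  # stand-in for SP.DYN.TFRT.IN

# lasso: coefficients of weak predictors are shrunk exactly to zero
cv_lasso <- cv.glmnet(x, y, family = "gaussian", alpha = 1)
coef(cv_lasso, s = "lambda.1se")     # sparse; nonzero rows = useful predictors
```

The nonzero rows of the coefficient vector are the predictors the lasso keeps; rerunning with `alpha = 0` gives the ridge fit, which shrinks but never fully removes predictors.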

Error in `[<-`(`*tmp*`, , i, value = …): subscript out of bounds

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-08 02:04:16
Question: In the following code, I am trying to create a matrix that lists the opt.lam for each city. Upon running the loop, the first two cities always work, and then I get an error for any cities after that. This is the error that I get (coefmatrix works fine; it's just lambdamatrix that produces this error): Error in [<- ( *tmp* , , i, value = c(0.577199381062121, 0.577199381062121, : subscript out of bounds Here is my code: lambdamatrix <- matrix(nrow=n,ncol=2) rownames(lambdamatrix) <
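"subscript out of bounds" in `lambdamatrix[, i] <- …` means the column index `i` exceeded `ncol(lambdamatrix)`: the matrix was preallocated with `ncol = 2`, while the loop writes one column per city, so the third city fails. A sketch of preallocating one column per city instead (city names and lambda values are placeholders, not the questioner's data):

```r
cities <- c("Austin", "Boston", "Chicago", "Denver")

# one column per city, so column assignment cannot run out of bounds
lambdamatrix <- matrix(NA_real_, nrow = 2, ncol = length(cities),
                       dimnames = list(c("lambda.min", "lambda.1se"), cities))

for (i in seq_along(cities)) {
  opt.lam <- c(0.05 * i, 0.10 * i)  # placeholder for the cv.glmnet lambdas
  lambdamatrix[, i] <- opt.lam
}
lambdamatrix
```

The original `matrix(nrow = n, ncol = 2)` would work with row-wise assignment, `lambdamatrix[i, ] <- opt.lam`; the fix is to make the assignment orientation match the allocation.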

cv.glmnet fails for ridge, not lasso, for simulated data with coder error

Submitted by 会有一股神秘感。 on 2019-12-06 21:22:54
Question: Gist. The error: Error in predmat[which, seq(nlami)] = preds : replacement has length zero. The context: data is simulated with a binary y, but there are n coders of the true y; the data is stacked n times and a model is fitted, trying to recover the true y. The error occurs for the L2 penalty but not the L1 penalty, and when Y is the coder Y but not when it is the true Y. The error is not deterministic; it depends on the seed. UPDATE: the error appears in versions after 1.9-8; 1.9-8 does not fail. Reproduction base
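The `replacement has length zero` message comes from cv.glmnet's internal bookkeeping when a fold's fitted lambda sequence fails to overlap the master sequence. A workaround that is often suggested, and is only a hedged sketch here (simulated data, not the questioner's coder setup), is to supply an explicit lambda grid so every fold fits the same sequence:

```r
library(glmnet)

set.seed(7)
x <- matrix(rnorm(300 * 5), 300, 5)
y <- rbinom(300, 1, plogis(x[, 1]))

# fixing the lambda grid keeps all CV folds on one sequence, which
# sidesteps the predmat indexing failure reported for ridge (alpha = 0)
grid  <- 10^seq(2, -4, length.out = 100)
cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 0, lambda = grid)
cvfit$lambda.min
```

Whether this resolves the questioner's exact case depends on the glmnet version, as the UPDATE about 1.9-8 suggests.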

Ridge-regression model: glmnet

Submitted by 馋奶兔 on 2019-12-06 15:49:40
Question: Fitting a linear-regression model using least squares on my training dataset works fine. library(Matrix) library(tm) library(glmnet) library(e1071) library(SparseM) library(ggplot2) trainingData <- read.csv("train.csv", stringsAsFactors=FALSE,sep=",", header = FALSE) testingData <- read.csv("test.csv",sep=",", stringsAsFactors=FALSE, header = FALSE) lm.fit = lm(as.factor(V42)~ ., data = trainingData) linearMPrediction = predict(lm.fit,newdata = testingData, se.fit = TRUE) mean(
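Note that `lm(as.factor(V42) ~ ., …)` mixes a classification-style response into a least-squares fit; for a ridge model, glmnet instead wants a numeric matrix `x`, the response `y`, and `alpha = 0`. A sketch of the ridge fit on simulated stand-ins for the train/test split (V42 and the CSV files come from the question; the data here is made up):

```r
library(glmnet)

set.seed(3)
train_x <- matrix(rnorm(150 * 4), 150, 4)
train_y <- rbinom(150, 1, 0.5)  # stand-in for the V42 column
test_x  <- matrix(rnorm(50 * 4), 50, 4)

# alpha = 0 gives ridge; cv.glmnet picks the penalty strength
ridge <- cv.glmnet(train_x, train_y, family = "binomial", alpha = 0)
pred  <- predict(ridge, newx = test_x, s = "lambda.min", type = "class")
```

If V42 is genuinely continuous, drop `as.factor()` and use `family = "gaussian"` instead.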

Glmnet is different with intercept=TRUE compared to intercept=FALSE and with penalty.factor=0 for an intercept in x

Submitted by 你离开我真会死。 on 2019-12-06 11:23:21
Question: I am new to glmnet and am playing with the penalty.factor option. The vignette says that it "Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model." And the longer PDF document has code. So I expected that running a regression with intercept = TRUE and no constant in x would be the same as running one with intercept = FALSE and a constant column in x with penalty.factor = 0. But the code below shows that it is not: the latter case has an intercept of 0 and the
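One likely reason the two setups differ, offered here as an explanation to check rather than a confirmed diagnosis: glmnet standardizes the columns of x by default, and a constant column has standard deviation zero, so it cannot survive standardization the way a true intercept does; `penalty.factor = 0` does not change that. A small sketch that makes the comparison reproducible:

```r
library(glmnet)

set.seed(5)
x <- matrix(rnorm(100 * 3), 100, 3)
y <- drop(1 + x %*% c(1, -1, 0.5) + rnorm(100))

# setup 1: let glmnet fit the intercept itself
f1 <- glmnet(x, y, intercept = TRUE)

# setup 2: constant column in x, unpenalized, no fitted intercept;
# the constant column has sd 0, so standardization zeroes it out
f2 <- glmnet(cbind(1, x), y, intercept = FALSE,
             penalty.factor = c(0, 1, 1, 1))

coef(f1, s = 0.1)
coef(f2, s = 0.1)  # intercept row is 0, matching the question's report
```

Setting `standardize = FALSE` in the second call is the natural experiment to test this explanation.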

glmnet error for logistic regression/binomial

Submitted by 时光毁灭记忆、已成空白 on 2019-12-06 03:33:15
I get this error when trying to fit glmnet() with family="binomial" for a logistic-regression fit: > data <- read.csv("DAFMM_HE16_matrix.csv", header=F) > x <- as.data.frame(data[,1:3]) > x <- model.matrix(~.,data=x) > y <- data[,4] > train=sample(1:dim(x)[1],287,replace=FALSE) > xTrain=x[train,] > xTest=x[-train,] > yTrain=y[train] > yTest=y[-train] > fit = glmnet(xTrain,yTrain,family="binomial") Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs, : one multinomial or binomial class has 1 or 0 observations; not allowed Any help would be greatly appreciated - I've searched
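The lognet error means that after the random split, one of the two classes ended up with at most one observation in yTrain, or that y is not actually two-valued. Inspecting `table(y)` before fitting, and splitting within each class, avoids it. A sketch on simulated data (the CSV from the question is not available here):

```r
set.seed(9)
y <- factor(c(rep("no", 290), rep("yes", 12)))  # heavily imbalanced labels
x <- matrix(rnorm(length(y) * 3), ncol = 3)

table(y)  # always check class counts before a binomial fit

# stratified split: sample within each class so both reach the training set
train <- unlist(lapply(split(seq_along(y), y),
                       function(idx) sample(idx, floor(0.8 * length(idx)))))
stopifnot(all(table(y[train]) >= 2))  # both classes present

fit <- glmnet::glmnet(x[train, ], y[train], family = "binomial")
```

A plain `sample()` of 287 of 302 rows can easily miss all 12 minority-class rows; stratifying makes the binomial fit well-posed by construction.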

Running glmnet package in R, getting error “missing value where TRUE/FALSE needed”, maybe due to missing values?

Submitted by ≡放荡痞女 on 2019-12-06 01:43:56
I am trying to use glmnet from the glmnet package to run a LASSO regression. I am using the following command: library(glmnet) glmnet(a,b,family="binomial",alpha=1) And I am getting the error: > Error in if (!all(o)) { : missing value where TRUE/FALSE needed a is a matrix with numerical values. b is a factor vector. However, b has some missing values. I suspect this might be what is causing the error. However, I don't see an option to exclude NAs in the glmnet documentation. Since glmnet doesn't accept the full data frame with a formula (and thus no na.omit), but uses
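glmnet has no na.action hook, so incomplete rows must be removed by hand before the call; base R's `complete.cases()` combined with an `is.na()` check on the response does this for matrix and response together. A sketch with a small made-up `a` and `b` (the questioner's actual data is not shown):

```r
# a: numeric matrix; b: factor response with some NAs, as in the question
set.seed(2)
a <- matrix(rnorm(20 * 3), 20, 3)
b <- factor(c(rep(c("yes", "no"), 8), NA, "yes", NA, "no"))

keep <- complete.cases(a) & !is.na(b)  # drop rows missing in either
a2 <- a[keep, , drop = FALSE]
b2 <- droplevels(b[keep])

stopifnot(!anyNA(a2), !anyNA(b2))      # glmnet now sees no missing values
fit <- glmnet::glmnet(a2, b2, family = "binomial", alpha = 1)
```

Subsetting `a` and `b` with the same `keep` vector is the important part; filtering them separately would misalign rows and labels.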

Why calculating MSE in lasso regression gives different outputs?

Submitted by ε祈祈猫儿з on 2019-12-05 17:18:50
I am trying to run different regression models on the Prostate cancer data from the lasso2 package. When I use the lasso, I see two different methods to calculate the mean square error. But they give me quite different results, so I want to know whether I'm doing something wrong or whether one method is simply better than the other. # Needs the following R packages. library(lasso2) library(glmnet) # Gets the prostate cancer dataset data(Prostate) # Defines the Mean Square Error function mse = function(x,y) { mean((x-y)^2)} # 75% of the sample size. smp_size = floor(0.75 * nrow(Prostate)
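The two MSE figures usually seen with the lasso measure different things: `cv.glmnet`'s `cvm` is a cross-validated estimate computed on the training folds, while `mse(predict(...), y_test)` is the error on a single held-out split. They estimate the same quantity from different data, so they will not agree numerically. A sketch contrasting both (simulated data in place of Prostate):

```r
library(glmnet)

set.seed(11)
x <- matrix(rnorm(300 * 8), 300, 8)
y <- drop(x %*% c(2, -1, 0, 0, 1, 0, 0, 0) + rnorm(300))
train <- sample(nrow(x), floor(0.75 * nrow(x)))

cvfit <- cv.glmnet(x[train, ], y[train], alpha = 1)

# method 1: cross-validated MSE at lambda.min, from the training folds
cv_mse <- cvfit$cvm[cvfit$lambda == cvfit$lambda.min]

# method 2: MSE on the held-out 25%, a single test split
pred     <- predict(cvfit, newx = x[-train, ], s = "lambda.min")
test_mse <- mean((pred - y[-train])^2)

c(cv = cv_mse, test = test_mse)  # similar in magnitude, not identical
```

Neither number is "wrong"; the cross-validated one averages over folds and is typically more stable, while the test-split one is an honest estimate on unseen data.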