r-caret

How to pass a character vector to the train function in caret (R)

こ雲淡風輕ζ · Submitted 2019-12-01 12:16:17
Question: I want to reduce the number of variables when I train my model. I have 784 features in total that I want to reduce to, say, 500. I can build a long string of the selected features with paste(), collapsing them with "+". For example, say this is my vector:

val <- "pixel40+pixel46+pixel48+pixel65+pixel66+pixel67"

Then I would like to pass it to the train function like so:

Rf_model <- train(label~val, data = training, method = "rf", ntree = 200, na.action = na.omit)

but
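A character string cannot be spliced into a formula like this; train() would look for a column literally named "val". One approach is to convert the string (or, more directly, the vector of feature names) into a formula object with reformulate() or as.formula(). A sketch, assuming a data frame `training` with a `label` column and pixel* predictor columns:

```r
library(caret)

# Selected feature names (hypothetical subset of the 784 pixel columns)
keep <- c("pixel40", "pixel46", "pixel48", "pixel65", "pixel66", "pixel67")

# reformulate() joins the terms with "+" and adds the response,
# producing: label ~ pixel40 + pixel46 + ... + pixel67
f <- reformulate(termlabels = keep, response = "label")

Rf_model <- train(f, data = training, method = "rf",
                  ntree = 200, na.action = na.omit)
```

Equivalently, an existing "+"-collapsed string can be turned into a formula with as.formula(paste("label ~", val)).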

Extract the coefficients for the best tuning parameters of a glmnet model in caret

假装没事ソ · Submitted 2019-12-01 06:01:11
I am running elastic-net regularization in caret using glmnet. I pass a sequence of alpha and lambda values, then I perform repeatedcv to find the optimal tunings of alpha and lambda. Here is an example where the optimal tunings are alpha = 0.7 and lambda = 0.5:

age <- c(4, 8, 7, 12, 6, 9, 10, 14, 7, 6, 8, 11, 11, 6, 2, 10, 14, 7, 12, 6, 9, 10, 14, 7)
gender <- make.names(as.factor(c(1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1)))
bmi_p <- c(0.86, 0.45, 0.99, 0.84, 0.85, 0.67, 0.91, 0.29, 0.88, 0.83, 0.48, 0.99, 0.80, 0.85, 0.50, 0
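For the question in the title: caret stores the winning alpha/lambda pair in the train object's bestTune slot and the underlying glmnet fit (over the whole lambda path) in finalModel. A sketch, assuming `model` is a fitted train() object with method = "glmnet":

```r
library(caret)
library(glmnet)

# Best tuning parameters chosen by the resampling procedure
best_alpha  <- model$bestTune$alpha
best_lambda <- model$bestTune$lambda

# Coefficients at the optimal lambda; returned as a sparse matrix
# in which zeroed-out terms are the ones elastic net dropped.
coef(model$finalModel, s = best_lambda)
```

Note that model$finalModel was already fit with the best alpha, so only lambda needs to be supplied to coef().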

Creating a data partition using caret and data.table

倖福魔咒の · Submitted 2019-12-01 05:57:25
I have a data.table in R that I want to use with the caret package:

set.seed(42)
trainingRows <- createDataPartition(DT$variable, p = 0.75, list = FALSE)
head(trainingRows) # view the sampled row numbers

However, I am not able to select the rows with data.table. Instead I had to convert to a data.frame:

DT_df <- as.data.frame(DT)
DT_train <- DT_df[trainingRows, ]
dim(DT_train)

The data.table alternative DT_train <- DT[.(trainingRows), ] requires the keys to be set. Any better option than converting to a data.frame?

Bruce Pucci: Roll your own:

inTrain <- sample(MyDT[, .I], floor(MyDT[, .N] * .75))
Train <-
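One point worth noting: createDataPartition() returns a one-column matrix, while data.table's `i` argument takes a plain integer vector; the `.()` join syntax is only needed for keyed lookups. A sketch on hypothetical data:

```r
library(data.table)
library(caret)

DT <- data.table(variable = rnorm(100), x = runif(100))

set.seed(42)
trainingRows <- createDataPartition(DT$variable, p = 0.75, list = FALSE)

# Drop the matrix dimension so data.table sees ordinary row numbers
idx <- as.vector(trainingRows)

DT_train <- DT[idx]    # integer subscripts work directly; no keys needed
DT_test  <- DT[-idx]   # complement for the test set
```

This keeps everything in data.table, avoiding the round trip through as.data.frame().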

Error: Error in lognet(x, is.sparse, ix, jx, y, weights, offset, alpha, nobs), etc.

江枫思渺然 · Submitted 2019-12-01 04:24:27
Question: I am getting an error when using glmnet in caret. Example below.

Load libraries:
library(dplyr)
library(caret)
library(C50)

Load the churn data set from the C50 library:
data(churn)

Create the x and y variables:
churn_x <- subset(churnTest, select = -churn)
churn_y <- churnTest[[20]]

Use createFolds() to create 5 CV folds on churn_y, the target variable:
myFolds <- createFolds(churn_y, k = 5)

Create the trainControl object myControl:
myControl <- trainControl( summaryFunction = twoClassSummary, classProbs = TRUE, #
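The error text is cut off above, but a completed version of this setup might look like the following sketch. Two common triggers of lognet errors in this pattern are worth guarding against: with classProbs = TRUE the outcome's factor levels must be valid R names, and createFolds() by default returns the held-out rows, whereas trainControl's `index` expects the training rows of each resample (so returnTrain = TRUE is used here). This assumes the churn data still ships with the installed C50 version:

```r
library(caret)
library(C50)

data(churn)  # provides churnTrain / churnTest in older C50 releases

churn_x <- subset(churnTest, select = -churn)
churn_y <- churnTest$churn

# Class labels must be syntactically valid names when classProbs = TRUE
levels(churn_y) <- make.names(levels(churn_y))

# index wants TRAINING rows per resample, not held-out rows
myFolds <- createFolds(churn_y, k = 5, returnTrain = TRUE)

myControl <- trainControl(
  summaryFunction = twoClassSummary,
  classProbs = TRUE,
  savePredictions = TRUE,
  index = myFolds
)

model_glmnet <- train(x = churn_x, y = churn_y,
                      metric = "ROC", method = "glmnet",
                      trControl = myControl)
```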

Using neuralnet with caret train and adjusting the parameters

ぐ巨炮叔叔 · Submitted 2019-12-01 00:31:23
So I've read a paper that used neural networks to model a dataset similar to one I'm currently using. I have 160 descriptor variables that I want to model for 160 cases (regression modelling). The paper I read used the following parameters: 'For each split, a model was developed for each of the 10 individual train-test folds. A three-layer back-propagation net with 33 input neurons and 16 hidden neurons was used with online weight updates, 0.25 learning rate, and 0.9 momentum. For each fold, learning was conducted from a total of 50 different random initial weight
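A rough mapping of those settings onto caret's "neuralnet" method might look like the sketch below. This is not a faithful reproduction of the paper: caret's neuralnet wrapper tunes only the three hidden-layer sizes (layer1–layer3), and extras such as the algorithm and learning rate are passed straight through to neuralnet(), where learningrate applies only when algorithm = "backprop". The data frame `dat` is a stand-in for the 160 x 160 data set described above:

```r
library(caret)

set.seed(1)
dat <- as.data.frame(matrix(rnorm(160 * 161), nrow = 160))
names(dat)[161] <- "y"

# One hidden layer of 16 neurons, mirroring the paper's architecture
grid <- expand.grid(layer1 = 16, layer2 = 0, layer3 = 0)

ctrl <- trainControl(method = "cv", number = 10)

fit <- train(y ~ ., data = dat, method = "neuralnet",
             tuneGrid = grid, trControl = ctrl,
             algorithm = "backprop", learningrate = 0.25,
             linear.output = TRUE)  # regression output
```

The 0.9 momentum and 50 random restarts have no direct equivalents in this wrapper and would need a custom caret model or direct calls to neuralnet().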

R caret / rfe variable selection for factors() AND NAs

北战南征 · Submitted 2019-12-01 00:16:37
I have a data set with NAs sprinkled generously throughout. In addition, it has columns that need to be factors(). I am using the rfe() function from the caret package to select variables. It seems the functions= argument in rfe() with lmFuncs works on the data with NAs but NOT on factor variables, while rfFuncs works on factor variables but NOT with NAs. Any suggestions for dealing with this? I tried model.matrix(), but it seems to just cause more problems. Because of inconsistent behavior on these points between packages, not to mention the extra trickiness when going to more "meta"
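One workable compromise, sketched below on a hypothetical data frame `df` with a factor response `y`: impute the missing numeric values first (medianImpute is one simple choice), then run rfe() with rfFuncs, which handles factor predictors natively via random forests:

```r
library(caret)

# Impute NAs in the numeric predictor columns so rfFuncs can run
num_cols <- setdiff(names(df)[sapply(df, is.numeric)], "y")
pp <- preProcess(df[, num_cols, drop = FALSE], method = "medianImpute")
df[, num_cols] <- predict(pp, df[, num_cols, drop = FALSE])

# Factor columns are left as-is; random forests accept them directly
ctrl <- rfeControl(functions = rfFuncs, method = "cv", number = 5)
res  <- rfe(x = df[, setdiff(names(df), "y")], y = df$y,
            sizes = c(5, 10, 20), rfeControl = ctrl)
```

Rows where a factor value itself is NA would still need handling (e.g. an explicit "missing" level or row removal) before this runs.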

Dummy variables and preProcess

谁说我不能喝 · Submitted 2019-11-30 20:43:40
I have a data frame with some dummy variables that I want to use as a training set for glmnet. Since I'm using glmnet, I want to center and scale the features using the preProcess option in the caret train function. I don't want this transformation applied to the dummy variables as well. Is there a way to prevent the transformation of these variables?

There's not (currently) a way to do this besides writing a custom model to do so (see the example with PLS and RF near the end). I'm working on a method to specify which variables get which pre-processing method. However, with dummy variables,
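Until selective pre-processing is available, one workaround is to pre-process the continuous columns manually before calling train() and omit the preProcess argument entirely. A sketch on hypothetical data with a single 0/1 dummy:

```r
library(caret)

set.seed(3)
train_df <- data.frame(x1 = rnorm(50, 10, 3),
                       x2 = runif(50, 0, 100),
                       dummy = rbinom(50, 1, 0.5),
                       y = rnorm(50))

# Center/scale only the continuous columns, leaving the dummy untouched
cont <- c("x1", "x2")
pp <- preProcess(train_df[, cont], method = c("center", "scale"))
train_df[, cont] <- predict(pp, train_df[, cont])

# No preProcess argument here, so the dummy survives as 0/1
fit <- train(y ~ ., data = train_df, method = "glmnet")
```

The same preProcess object (`pp`) must then be applied to new data before predicting, since train() no longer does it automatically.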

How to preProcess features when some of them are factors?

陌路散爱 · Submitted 2019-11-30 17:46:38
My question is related to this one regarding categorical data (factors, in R terms) when using the caret package. I understand from the linked post that if you use the "formula interface", some features can be factors and training will work fine. My question is: how can I scale the data with the preProcess() function? If I try to do it on a data frame with some columns as factors, I get this error message:

Error in preProcess.default(etitanic, method = c("center", "scale")) : all columns of x must be numeric

See some sample code here:

library(earth)
data(etitanic)
a <- preProcess(etitanic,
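One workaround is to run preProcess() on the numeric columns only and leave the factors alone, sketched below on the same etitanic data (which mixes factor and numeric columns):

```r
library(caret)
library(earth)

data(etitanic)

# Identify the numeric columns; factors stay out of the scaling
num_cols <- sapply(etitanic, is.numeric)
pp <- preProcess(etitanic[, num_cols, drop = FALSE],
                 method = c("center", "scale"))

scaled <- etitanic
scaled[, num_cols] <- predict(pp, etitanic[, num_cols, drop = FALSE])
```

Alternatively, passing preProcess = c("center", "scale") to train() with the formula interface sidesteps the issue, since caret dummy-codes the factors into numeric columns before pre-processing.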

How to specify a validation holdout set to caret

回眸只為那壹抹淺笑 · Submitted 2019-11-30 16:01:10
I really like using caret for at least the early stages of modelling, especially for its really easy-to-use resampling methods. However, I'm working on a model where the training set has a fair number of cases added via semi-supervised self-training, and my cross-validation results are really skewed because of it. My solution is to use a validation set to measure model performance, but I can't see a way to use a validation set directly within caret - am I missing something, or is this just not supported? I know that I can write my own wrappers to do what caret would normally do for me, but it
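caret does support fixed holdouts via the index and indexOut arguments of trainControl(): each is a list with one element per "resample", holding the row numbers to train on and to evaluate on, so a single element gives exactly one train/validation split. A sketch on hypothetical data, with rows 101-120 serving as the validation set:

```r
library(caret)

set.seed(7)
dat <- data.frame(x1 = rnorm(120), x2 = rnorm(120))
dat$y <- 2 * dat$x1 - dat$x2 + rnorm(120)

train_idx <- 1:100    # rows used to fit each candidate model
valid_idx <- 101:120  # fixed validation rows for performance

# One "resample" whose held-out rows are exactly the validation set
ctrl <- trainControl(index    = list(holdout = train_idx),
                     indexOut = list(holdout = valid_idx))

fit <- train(y ~ ., data = dat, method = "lm", trControl = ctrl)
fit$resample  # metrics computed on the validation rows only
```

Tuning-parameter selection then uses validation-set performance rather than cross-validation, which should avoid the skew from the self-trained cases.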