r-caret

Pass PCA preprocessing arguments to train()

落爺英雄遲暮 posted on 2019-12-03 10:32:43
I'm trying to build a predictive model in caret using PCA as pre-processing. The pre-processing would be as follows:

    preProc <- preProcess(IL_train[,-1], method="pca", thresh = 0.8)

Is it possible to pass the thresh argument directly to caret's train() function? I've tried the following, but it doesn't work:

    modelFit_pp <- train(IL_train$diagnosis ~ . , preProcess="pca", thresh= 0.8, method="glm", data=IL_train)

If not, how can I pass the separate preProc results to the train() function?

As per the documentation, you specify additional preprocessing arguments with trainControl

    ?trainControl

..
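A minimal sketch of the trainControl route mentioned above, assuming caret is installed. IL_train is not shown in the excerpt, so a two-class subset of iris stands in for it (a hypothetical substitute). The threshold is supplied through trainControl's preProcOptions list rather than to train() directly.

```r
library(caret)

# Hypothetical stand-in for IL_train: two-class outcome, numeric predictors.
demo <- droplevels(iris[iris$Species != "setosa", ])

# PCA options go through trainControl(), not through train() itself.
ctrl <- trainControl(method = "cv", number = 5,
                     preProcOptions = list(thresh = 0.8))

set.seed(1)
modelFit_pp <- train(Species ~ ., data = demo, method = "glm",
                     preProcess = "pca", trControl = ctrl)

# The fitted preProcess object records the threshold and the components kept.
modelFit_pp$preProcess$thresh
modelFit_pp$preProcess$numComp
```

Because the pre-processing is declared inside train(), the PCA is re-estimated within each resample instead of once on the full training set.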

caret::train: specify further non-tuning parameters for mlpWeightDecay (RSNNS package)

送分小仙女 posted on 2019-12-03 09:42:31
I have a problem with specifying the learning rate using the caret package with the method "mlpWeightDecay" from the RSNNS package. The tuning parameters of "mlpWeightDecay" are size and decay. An example leaving size constant at 4 and tuning decay over c(0, 0.0001, 0.001, 0.002):

    data(iris)
    TrainData <- iris[,1:4]
    TrainClasses <- iris[,5]
    fit1 <- train(TrainData, TrainClasses,
                  method = "mlpWeightDecay",
                  preProcess = c("center", "scale"),
                  tuneGrid = expand.grid(.size = 4, .decay = c(0, 0.0001, 0.001, 0.002)),
                  trControl = trainControl(method = "cv"))

But I also want to manipulate the learning rate of

Different results with formula and non-formula for caret training

别来无恙 posted on 2019-12-03 08:36:40
I noticed that using the formula and non-formula methods in caret while training produces different results. Also, the time taken for the formula method is almost 10x the time taken for the non-formula method. Is this expected?

    > z <- data.table(c1=sample(1:1000,1000, replace=T), c2=as.factor(sample(LETTERS, 1000, replace=T)))

    # SYSTEM TIME WITH FORMULA METHOD
    # -------------------------------
    > system.time(r <- train(c1 ~ ., z, method="rf", importance=T))
       user  system elapsed
    376.233   9.241  18.190
    > r
    1000 samples
       1 predictors

    No pre-processing
    Resampling: Bootstrap (25 reps)
    Summary of sample sizes:
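One likely contributor, offered as an assumption because the thread's answer is not part of this excerpt: the formula interface expands factors into dummy columns before the model sees them, whereas the x/y interface passes the factor through as a single column. The base R sketch below shows that expansion on data shaped like z (a plain data.frame is used in place of data.table).

```r
set.seed(42)
z <- data.frame(
  c1 = sample(1:1000, 1000, replace = TRUE),
  c2 = factor(sample(LETTERS, 1000, replace = TRUE), levels = LETTERS)
)

# What a formula-based fit works with: one indicator column per
# non-reference level of the factor (the intercept column is dropped here).
x_formula <- model.matrix(c1 ~ ., data = z)[, -1, drop = FALSE]
ncol(x_formula)   # 25 dummy columns for the 26-level factor

# What the non-formula interface receives: the factor itself, as one column.
x_plain <- z[, "c2", drop = FALSE]
ncol(x_plain)     # 1
```

A random forest fed 25 indicator columns is fitting a different problem from one that splits on a single 26-level factor, so differences in both results and run time would not be surprising.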

R: using ranger with caret, tuneGrid argument

ε祈祈猫儿з posted on 2019-12-03 07:50:10
I'm using the caret package to analyse Random Forest models built using ranger. I can't figure out how to call the train function using the tuneGrid argument to tune the model parameters. I think I'm calling the tuneGrid argument wrong, but can't figure out why it's wrong. Any help would be appreciated.

    data(iris)
    library(ranger)
    model_ranger <- ranger(Species ~ ., data = iris, num.trees = 500, mtry = 4, importance = 'impurity')

    library(caret)
    # my tuneGrid object:
    tgrid <- expand.grid(
      num.trees = c(200, 500, 1000),
      mtry = 2:4
    )
    model_caret <- train(Species ~ ., data = iris, method = "ranger
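A hedged sketch of one way the grid could be written. In caret, the columns allowed in tuneGrid are the parameters listed by modelLookup() for the method. For "ranger" these are assumed here to be mtry, splitrule and min.node.size (as in recent caret releases; older ones may list only mtry). num.trees is not among them, so it is passed through train()'s ... to ranger instead.

```r
library(caret)

# Which columns tuneGrid may contain for this method.
params <- as.character(modelLookup("ranger")$parameter)
print(params)

# Grid over mtry only; the other tunables are held at single values.
# Assumes a caret version where splitrule and min.node.size are tunable.
tgrid <- expand.grid(mtry = 2:4, splitrule = "gini", min.node.size = 1)

if (requireNamespace("ranger", quietly = TRUE)) {
  set.seed(1)
  model_caret <- train(Species ~ ., data = iris, method = "ranger",
                       tuneGrid = tgrid[, params, drop = FALSE],
                       num.trees = 200,          # passed through to ranger
                       importance = "impurity",
                       trControl = trainControl(method = "cv", number = 5))
  print(model_caret$bestTune)
}
```

To compare several values of num.trees, one option is to repeat the train() call once per value and compare the resampled results.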

Issues with tuneGrid parameter in random forest

ぐ巨炮叔叔 posted on 2019-12-03 07:14:26
I've been dealing with some extremely imbalanced data and I would like to use stratified sampling to create more balanced random forests. Right now, I'm using the caret package, mainly for tuning the random forests. So I tried to set up a tuneGrid to pass the mtry and sampsize parameters into caret's train method as follows.

    mtryGrid <- data.frame(.mtry = 100, .sampsize = 80)
    rfTune <- train(x = trainX, y = trainY, method = "rf", trControl = ctrl, metric = "Kappa", ntree = 1000, tuneGrid = mtryGrid, importance = TRUE)

When I run this example, I get the following error

    The tuning parameter grid
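A sketch of one possible arrangement, under the assumption that sampsize is not a caret tuning parameter for method = "rf" (modelLookup() lists what is). The grid then carries mtry alone, and sampsize travels through train()'s ... to randomForest. The data below are synthetic stand-ins for trainX and trainY, which the excerpt does not show.

```r
library(caret)

# Tuning parameters caret recognises for method = "rf".
rf_params <- as.character(modelLookup("rf")$parameter)
print(rf_params)

# Hypothetical imbalanced data standing in for trainX / trainY.
set.seed(1)
trainX <- data.frame(a = rnorm(300), b = rnorm(300), c = rnorm(300))
trainY <- factor(c(rep("rare", 30), rep("common", 270)))

if (requireNamespace("randomForest", quietly = TRUE)) {
  rfTune <- train(x = trainX, y = trainY, method = "rf",
                  metric = "Kappa",
                  tuneGrid = data.frame(mtry = 2),   # grid holds mtry only
                  ntree = 200,
                  sampsize = c(20, 20),              # per-class draws, in factor level order
                  trControl = trainControl(method = "cv", number = 5))
  print(rfTune$results)
}
```

With a vector-valued sampsize, randomForest stratifies the per-tree sample by the outcome classes, which is what balances each tree here.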

Fit a no-intercept model in caret

守給你的承諾、 posted on 2019-12-03 05:28:12
In R, I specify a model with no intercept as follows:

    data(iris)
    lmFit <- lm(Sepal.Length ~ 0 + Petal.Length + Petal.Width, data=iris)
    > round(coef(lmFit),2)
    Petal.Length  Petal.Width
            2.86        -4.48

However, if I fit the same model with caret, the resulting model includes an intercept:

    library(caret)
    caret_lmFit <- train(Sepal.Length~0+Petal.Length+Petal.Width, data=iris, "lm")
    > round(coef(caret_lmFit$finalModel),2)
     (Intercept) Petal.Length  Petal.Width
            4.19         0.54        -0.32

How do I tell caret::train to exclude the intercept term?

As discussed in a linked SO question https://stackoverflow.com/a/41731117
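The linked answer is not reproduced in this excerpt, so the following is a sketch of one known route rather than a quotation of it. For method = "lm", caret exposes a logical tuning parameter named intercept, which can be set to FALSE through tuneGrid. The 5-fold CV setting is an arbitrary choice for illustration.

```r
library(caret)
data(iris)

# For method = "lm", the tuning grid can switch the intercept off.
caret_lmFit <- train(Sepal.Length ~ Petal.Length + Petal.Width,
                     data = iris, method = "lm",
                     tuneGrid = expand.grid(intercept = FALSE),
                     trControl = trainControl(method = "cv", number = 5))

# Reference fit with base R for comparison.
lmFit <- lm(Sepal.Length ~ 0 + Petal.Length + Petal.Width, data = iris)

round(coef(caret_lmFit$finalModel), 2)
round(coef(lmFit), 2)
```

The two coefficient vectors should agree, and the caret model should no longer report an (Intercept) term.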

How to compute ROC and AUC under ROC after training using caret in R?

一曲冷凌霜 posted on 2019-12-03 05:17:59
Question: I have used the caret package's train function with 10-fold cross validation. I have also obtained class probabilities for the predicted classes by setting classProbs = TRUE in trControl, as follows:

    myTrainingControl <- trainControl(method = "cv", number = 10, savePredictions = TRUE, classProbs = TRUE, verboseIter = TRUE)
    randomForestFit = train(x = input[3:154], y = as.factor(input$Target), method = "rf", trControl = myTrainingControl, preProcess = c("center","scale"), ntree = 50)

The output
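A base R sketch of how ROC points and the AUC could be computed from held-out class probabilities. With savePredictions = TRUE and classProbs = TRUE, train() keeps the held-out predictions in the fitted object's pred element, with an obs column and one probability column per class level. The small data frame below is a hypothetical stand-in for that element, and the class names "yes" and "no" are assumptions.

```r
# Hypothetical stand-in for randomForestFit$pred: observed class plus the
# held-out probability of the positive class (here called "yes").
pred <- data.frame(
  obs = factor(c("yes", "yes", "no", "no"), levels = c("no", "yes")),
  yes = c(0.9, 0.4, 0.6, 0.1)
)

# AUC via the rank-sum (Mann-Whitney) identity.
auc_rank <- function(score, is_pos) {
  r <- rank(score)                 # average ranks handle ties
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_rank(pred$yes, pred$obs == "yes")   # 0.75

# ROC points: sweep the threshold over the observed scores.
thr <- sort(unique(pred$yes), decreasing = TRUE)
roc <- data.frame(
  threshold = thr,
  tpr = sapply(thr, function(t) mean(pred$yes[pred$obs == "yes"] >= t)),
  fpr = sapply(thr, function(t) mean(pred$yes[pred$obs == "no"] >= t))
)
roc
```

If several tuning values were tried, the pred element holds rows for each of them, so it would need filtering to the rows matching bestTune before applying the same calculation.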

Improving model training speed in caret (R)

大憨熊 posted on 2019-12-03 04:30:50
Question: I have a dataset consisting of 20 features and roughly 300,000 observations. I'm using caret to train models with doParallel and four cores. Even training on 10% of my data takes well over eight hours for the methods I've tried (rf, nnet, adabag, svmPoly). I'm resampling with bootstrapping 3 times and my tuneLength is 5. Is there anything I can do to speed up this agonizingly slow process? Someone suggested that using the underlying library can speed up the process as much as 10x, but

SVM with cross validation in R using caret

一世执手 posted on 2019-12-03 03:25:54
Question: I was told to use the caret package in order to perform Support Vector Machine regression with 10-fold cross validation on a data set I have. I'm plotting my response variable against 151 variables. I did the following:

    > ctrl <- trainControl(method = "repeatedcv", repeats = 10)
    > set.seed(1500)
    > mod <- train(RT..seconds.~., data=cadets, method = "svmLinear", trControl = ctrl)

in which I got

    C    RMSE  Rsquared  RMSE SD  Rsquared SD
    0.2  50    0.8       20       0.1
    0.5  60    0.7       20       0.2
    1    60    0.7       20       0.2

But I want

Feature Selection in caret rfe + sum with ROC

亡梦爱人 posted on 2019-12-03 03:19:49
I have been trying to apply recursive feature selection using the caret package. What I need is for rfe to use the AUC as the performance measure. After googling for a month I cannot get the process working. Here is the code I have used:

    library(caret)
    library(doMC)
    registerDoMC(cores = 4)
    data(mdrr)
    subsets <- c(1:10)
    ctrl <- rfeControl(functions=caretFuncs, method = "cv", repeats =5, number = 10, returnResamp="final", verbose = TRUE)
    trainctrl <- trainControl(classProbs= TRUE)
    caretFuncs$summary <- twoClassSummary
    set.seed(326)
    rf.profileROC.Radial <- rfe(mdrrDescr, mdrrClass, sizes=subsets,