r-caret

Fully reproducible parallel models using caret

北战南征 posted on 2019-12-17 07:02:59
Question: When I run two random forests in caret, I get exactly the same results if I set a random seed:

    library(caret)
    library(doParallel)
    set.seed(42)
    myControl <- trainControl(method='cv', index=createFolds(iris$Species))
    set.seed(42)
    model1 <- train(Species~., iris, method='rf', trControl=myControl)
    set.seed(42)
    model2 <- train(Species~., iris, method='rf', trControl=myControl)
    > all.equal(predict(model1, type='prob'), predict(model2, type='prob'))
    [1] TRUE

However, if I register a parallel back-end to
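The excerpt is cut off before any answer, so the following is only a minimal sketch of one common approach, not necessarily what the original thread settled on. It assumes the `seeds` argument of `trainControl()` is used to fix the random seed for every resample and for the final fit, so that results should no longer depend on how work is spread across workers. The names `n_resamples`, `n_tune`, `seeds`, `folds`, and `ctrl` are illustrative, and the worker count of 2 is arbitrary.

```r
library(caret)
library(doParallel)

# Assumptions: 10 CV folds and at most 3 candidate mtry values
# (caret's default tuneLength for method = "rf" on iris).
n_resamples <- 10
n_tune <- 3

set.seed(42)
# One integer vector per resample, plus a single seed for the final model fit.
seeds <- vector(mode = "list", length = n_resamples + 1)
for (i in seq_len(n_resamples)) seeds[[i]] <- sample.int(1000, n_tune)
seeds[[n_resamples + 1]] <- sample.int(1000, 1)

# returnTrain = TRUE makes createFolds() return training-row indices,
# which is what the `index` argument expects.
folds <- createFolds(iris$Species, k = n_resamples, returnTrain = TRUE)
ctrl <- trainControl(method = "cv", index = folds, seeds = seeds)

cl <- makePSOCKcluster(2)
registerDoParallel(cl)
model1 <- train(Species ~ ., data = iris, method = "rf", trControl = ctrl)
model2 <- train(Species ~ ., data = iris, method = "rf", trControl = ctrl)
stopCluster(cl)
registerDoSEQ()  # return to sequential execution

same_probs <- all.equal(predict(model1, type = "prob"),
                        predict(model2, type = "prob"))
print(same_probs)
```

Passing the seeds through `trainControl()` rather than calling `set.seed()` before each `train()` matters here because a seed set in the main session is not carried over to the worker processes.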

How does Caret generate an OLS model with K-fold cross validation?

本秂侑毒 posted on 2019-12-14 03:59:05
Question: Let's say I have some generic dataset for which an OLS regression is the best choice. So, I generate a model with some first-order terms and decide to use caret in R for my regression coefficient estimates and error estimates. In caret, this ends up being:

    k10_cv = trainControl(method="cv", number=10)
    ols_model = train(Y ~ X1 + X2 + X3, data = my_data, trControl = k10_cv, method = "lm")

From there, I can pull out regression information using summary(ols_model) and can also pull some more
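As a hedged illustration of how the pieces relate: with `method = "lm"` there is nothing to tune, so cross-validation only estimates the error, and the reported coefficients come from a single model that caret refits on all rows. The sketch below uses made-up data in place of `my_data` (the simulated columns and their coefficients are assumptions for illustration only).

```r
library(caret)

# Hypothetical data standing in for my_data.
set.seed(1)
my_data <- data.frame(X1 = rnorm(100), X2 = rnorm(100), X3 = rnorm(100))
my_data$Y <- 1 + 2 * my_data$X1 - my_data$X2 + rnorm(100)

k10_cv <- trainControl(method = "cv", number = 10)
ols_model <- train(Y ~ X1 + X2 + X3, data = my_data,
                   trControl = k10_cv, method = "lm")

# Coefficients of the final model, which is refit on all rows.
cv_coef <- coef(ols_model$finalModel)
# Coefficients from a direct lm() call on the same data.
lm_coef <- coef(lm(Y ~ X1 + X2 + X3, data = my_data))
print(cbind(cv_coef, lm_coef))

# Hold-out metrics for each of the 10 folds, and their averages.
print(ols_model$resample)
print(ols_model$results)
```

The two coefficient columns should agree, while `ols_model$resample` holds the per-fold RMSE, R-squared, and MAE that are averaged into `ols_model$results`.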

Run cforest with controls = cforest_unbiased() using caret package

左心房为你撑大大i posted on 2019-12-14 03:49:56
Question: This question was migrated from Cross Validated because it can be answered on Stack Overflow. Migrated 6 years ago. I would like to run an unbiased cforest using the caret package. Is this possible?

    tc <- trainControl(method="cv", number=f, index=indexList, savePredictions=TRUE, classProbs = TRUE, summaryFunction = twoClassSummary)
    createCfGrid <- function(len, data) {
      g = createGrid("cforest", len, data)
      g = expand.grid(.controls = cforest_unbiased(mtry = 5, ntree = 1000))
      return(g)
    }
    set.seed

Test set and train set for each fold in Caret cross validation

拟墨画扇 posted on 2019-12-14 02:02:20
Question: I tried to understand the 5-fold cross-validation algorithm in the caret package, but I could not find out how to get the train set and test set for each fold, and I could not find this in the similar suggested questions either. Suppose I want to do cross-validation with the random forest method. I do the following:

    set.seed(12)
    train_control <- trainControl(method="cv", number=5, savePredictions = TRUE)
    rfmodel <- train(Species~., data=iris, trControl=train_control, method="rf")
    first_holdout <- subset
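The excerpt ends before any answer, so this is only a sketch of one way to recover the folds, under the assumption that the row indices stored in the fitted object's `control` element are what is wanted. `train_rows`, `test_rows`, `first_train`, and `first_fold_preds` are illustrative names.

```r
library(caret)

set.seed(12)
train_control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")

# Row indices used for fitting in each fold (a list with one element per fold).
train_rows <- rfmodel$control$index
# Row indices held out in each fold, in the same order.
test_rows <- rfmodel$control$indexOut

# Index by position, since the two lists may carry different names.
first_train <- iris[train_rows[[1]], ]
first_holdout <- iris[test_rows[[1]], ]

# With savePredictions = TRUE, hold-out predictions record the original
# row (rowIndex) and the fold they came from (Resample).
first_fold_preds <- rfmodel$pred[rfmodel$pred$Resample == names(train_rows)[1], ]
print(head(first_fold_preds))
```

Each row of `iris` appears in exactly one hold-out set, so the training and hold-out subsets of a fold together cover the whole data frame.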

R: using my own model in RFE (recursive feature elimination) to pick important features

落花浮王杯 posted on 2019-12-13 21:01:13
Question: Using RFE, you can get an importance ranking of the features, but right now I can only use the models and parameters built into the package, such as lmFuncs (linear model) and rfFuncs (random forest). It seems that caretFuncs allows custom settings for your own model and parameters, but I don't know the details, and the official documentation doesn't give any. I want to apply SVM and GBM in this RFE process, because these are the models I currently train. Does anyone have any idea?
Answer 1: I tried to recreate working

Caret - is it possible to save each model from tuning?

别等时光非礼了梦想. posted on 2019-12-13 16:27:13
Question: This question was migrated from Cross Validated because it can be answered on Stack Overflow. Migrated 5 years ago. I'm using caret to train models over resamples and tune learning parameters, and I can interrogate the probabilities for each test, which is great. But I'm also keen to retain the model objects and use them later without retraining -- is this possible? Basically, rather than just the mdl$finalModel object, I'd like the model object for each iteration of tuning.
Answer 1: Not really.

Caret nnet: logloss not working for twoClassSummary

时光怂恿深爱的人放手 posted on 2019-12-13 13:40:45
Question: I have a training dataset

       Out  Revolver     Ratio  Num  ...
    0    1  0.766127  0.802982    0  ...
    1    0  0.957151  0.121876    1
    2    0  0.658180  0.085113    0
    3    0  0.233810  0.036050    3
    4    1  0.907239  0.024926    5

The outcome variable Out is binary and only takes on the values 0 or 1. Num is not a factor. I then attempted to run nnet using caret. I want to eventually try nnGrid, but I just want to make sure this works first:

    nnTrControl=trainControl(method = "cv", classProbs = TRUE, summaryFunction = twoClassSummary, number = 2
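The question is cut off, so the exact failure is unknown. As a hedged sketch of one setup that does report log loss: `twoClassSummary` returns ROC, Sens, and Spec, whereas caret's `mnLogLoss` summary function returns a `logLoss` column. With `classProbs = TRUE`, the outcome also has to be a factor whose levels are valid R names, so a 0/1 column needs relabelling. The data below are simulated, and the labels `"no"`/`"yes"` and the names `train_df`, `ctrl`, and `nn_fit` are assumptions for illustration.

```r
library(caret)

# Hypothetical data shaped like the excerpt's columns.
set.seed(1)
n <- 200
train_df <- data.frame(Revolver = runif(n), Ratio = runif(n), Num = rpois(n, 1))
train_df$Out <- rbinom(n, 1, plogis(2 * train_df$Revolver - 1))

# classProbs = TRUE needs a factor outcome whose levels are valid R names,
# so 0/1 is relabelled to "no"/"yes".
train_df$Out <- factor(train_df$Out, levels = c(0, 1), labels = c("no", "yes"))

# mnLogLoss is the caret summary function that computes log loss.
ctrl <- trainControl(method = "cv", number = 2, classProbs = TRUE,
                     summaryFunction = mnLogLoss)

set.seed(1)
nn_fit <- train(Out ~ ., data = train_df, method = "nnet",
                metric = "logLoss", maximize = FALSE,  # lower log loss is better
                trControl = ctrl, trace = FALSE)
print(nn_fit$results)
```

Setting `maximize = FALSE` explicitly avoids relying on the installed caret version to know that log loss should be minimised.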

Caret package not available for version 3.4.2

冷暖自知 posted on 2019-12-13 04:33:10
Question: I'm trying to install the caret package in R, yet I get an error message saying that package ‘caret’ is not available (for R version 3.4.2). Is there any way around this?
Answer 1: R CMD build (via r-devel) added the higher requirement of 3.5.0 with this message: Added dependency on R >= 3.5.0 because serialized objects in serialize/load version 3 cannot be read in older versions of R. File(s) containing such objects: caret/inst/models/models.RData
Source: https://stackoverflow.com/questions/55383673

r-caret: custom model with n-dimensional output

江枫思渺然 posted on 2019-12-13 00:11:35
Question: I am trying to fit an autoencoder model using caret. I run into problems because my output will be more than 1-dimensional (e.g. y in the call to train() will be a matrix in my case). Is there a way to account for output of higher dimension in caret?
Answer 1: Right now, it only handles univariate outcomes. I'm not sure why you would be using train for an autoencoder though; those are usually used as a pre-processing step rather than the outcome itself.
Source: https://stackoverflow.com/questions

caret train binary glm fails on parallel cluster via doParallel

梦想与她 posted on 2019-12-12 22:58:37
Question: I have seen that there are a lot of questions around this topic already, but none seems to give a satisfying answer to my problem. I intend to use caret::train() in combination with the doParallel library on a Windows machine. The documentation (The caret package: 9 Parallel Processing) tells me that it will run in parallel by default if it finds a registered cluster (although it uses the doMC library). When I attempt setting up a cluster with doParallel and follow the example calculation in its