r-caret

createTimeSlices function in CARET package in R

。_饼干妹妹, submitted on 2019-11-29 08:38:56
Question: I am working with multivariate financial time series data and am having problems using the createTimeSlices function. The only usage of the function I can find is the one by Max Kuhn. Can anybody help me understand how to use it?

Answer 1: The documentation is being "improved" on this feature (in other words, it currently sucks). Another person contacted me recently about this and here is the example:

    library(caret)
    library(ggplot2)
    data(economics)
    myTimeControl <- trainControl
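Independently of the truncated example above, the following is a minimal sketch of what createTimeSlices returns. The toy series and the values chosen for initialWindow and horizon are assumptions made purely for illustration.

```r
library(caret)

# Toy series: ten time-ordered observations (illustrative data only)
y <- 1:10

# Rolling-origin slices: 5 training points, then the next 2 as the test set
slices <- createTimeSlices(y, initialWindow = 5, horizon = 2, fixedWindow = TRUE)

# Each element of $train / $test is a vector of row indices
first_train <- slices$train[[1]]   # indices 1..5
first_test  <- slices$test[[1]]    # indices 6..7
n_slices    <- length(slices$train)

print(first_train)
print(first_test)
print(n_slices)
```

The same windowing can be requested during model tuning with trainControl(method = "timeslice", initialWindow = ..., horizon = ..., fixedWindow = ...), so that each resample trains on the past and is scored on the observations that follow.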

Saving and loading a model in R

风格不统一, submitted on 2019-11-29 05:13:54
Question: When working with caret, how can I save a model after training and load it later (e.g. in a different session) for prediction?

Answer 1: A better solution nowadays is to use saveRDS to save and readRDS to read:

    saveRDS(model, "model.rds")
    my_model <- readRDS("model.rds")

This lets you choose a new name for the object (you don't need to remember the name you used when you saved it).

Answer 2: The correct syntax would be to use:

    save(model, file="model.Rdata")

Thereafter, it can be loaded using the
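A minimal round-trip sketch of the saveRDS/readRDS approach follows. It uses a plain lm fit and a temporary file as stand-ins (both are assumptions for the demo); an object returned by caret's train can be saved the same way.

```r
# Fit a small model on a built-in data set (lm stands in for any fitted model,
# including an object returned by caret::train)
model <- lm(Volume ~ Girth + Height, data = trees)

# Save to a temporary file; in practice use a path such as "model.rds"
path <- tempfile(fileext = ".rds")
saveRDS(model, path)

# Restore under any name you like, e.g. in a later session
my_model <- readRDS(path)

# The restored object predicts exactly like the original
original_pred <- predict(model, newdata = head(trees))
restored_pred <- predict(my_model, newdata = head(trees))
print(all.equal(original_pred, restored_pred))
```

The design difference is that readRDS returns the object so you assign it yourself, whereas load() recreates the object under the name it had when save() was called.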

caret::train: specify model-generation-parameters

放肆的年华, submitted on 2019-11-29 04:01:07
Question: I'm using the caret library in R for model generation. I want to generate an earth (aka MARS) model and specify the degree parameter for it. According to the documentation (page 11), the earth method supports this parameter. I get the following error message when specifying the parameter:

    > library(caret)
    > data(trees)
    > train(Volume~Girth+Height, data=trees, method='earth', degree=1)
    Error in { : task 1 failed - "formal argument "degree" matched by multiple actual
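A likely reading of this error is that degree is one of the tuning parameters caret manages for method = "earth", so passing it through `...` supplies it twice. A hedged sketch of the usual alternative is to fix it through tuneGrid instead; the nprune values below are arbitrary choices for the demo, not values from the question.

```r
library(caret)   # method = "earth" also needs the earth package installed
data(trees)

# degree and nprune are the tuning parameters caret exposes for "earth";
# the nprune values here are arbitrary choices for the demo
marsGrid <- expand.grid(degree = 1, nprune = 2:4)

set.seed(1)
fit <- train(Volume ~ Girth + Height, data = trees,
             method = "earth", tuneGrid = marsGrid)

print(fit$bestTune)
```

With a single value for degree in the grid, every candidate model uses that degree and only nprune is actually tuned.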

Error in ConfusionMatrix the data and reference factors must have the same number of levels

青春壹個敷衍的年華, submitted on 2019-11-28 21:18:28
I've trained a tree model with R caret. I'm now trying to generate a confusion matrix and keep getting the following error:

    Error in confusionMatrix.default(predictionsTree, testdata$catgeory) :
      the data and reference factors must have the same number of levels

    prob <- 0.5 #Specify class split
    singleSplit <- createDataPartition(modellingData2$category, p=prob, times=1, list=FALSE)
    cvControl <- trainControl(method="repeatedcv", number=10, repeats=5)
    traindata <- modellingData2[singleSplit,]
    testdata <- modellingData2[-singleSplit,]
    treeFit <- train(traindata$category~., data=traindata,
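This message generally means the predicted factor and the reference factor do not carry the same level set, for example when a class is never predicted. The sketch below uses made-up vectors (an assumption, since the original data is not shown) to illustrate re-declaring the predictions with the reference's levels. It is also worth checking that the column name in the call (testdata$catgeory in the error message) matches the name used elsewhere in the code (category).

```r
library(caret)   # confusionMatrix may also require the e1071 package

# Reference has three classes; the predictions happen never to include "c"
reference <- factor(c("a", "b", "c", "a", "b", "c"))
predicted <- factor(c("a", "b", "a", "a", "b", "b"))

# Re-declare the predictions with the reference's full level set
predicted_aligned <- factor(predicted, levels = levels(reference))

cm <- confusionMatrix(predicted_aligned, reference)
print(cm$table)
```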

Applying k-fold Cross Validation model using caret package

余生颓废, submitted on 2019-11-28 17:40:39
Let me start by saying that I have read many posts on cross-validation and it seems there is much confusion out there. My understanding is simply this: perform k-fold cross-validation (e.g. 10 folds) to understand the average error across the 10 folds. If it is acceptable, then train the model on the complete data set. I am attempting to build a decision tree using rpart in R and taking advantage of the caret package. Below is the code I am using.

    # load libraries
    library(caret)
    library(rpart)
    # define training control
    train_control<- trainControl(method="cv", number=10)
    # train the model
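The workflow described above can be sketched as follows, using iris as a stand-in data set (an assumption, since the original data is not shown). In caret, train runs the resampling and then refits the selected model on all rows, so the "train on the complete data set" step is already covered by fit$finalModel.

```r
library(caret)   # method = "rpart" needs the rpart package installed
set.seed(42)

# 10-fold cross-validation
train_control <- trainControl(method = "cv", number = 10)

# iris is a stand-in data set for illustration
fit <- train(Species ~ ., data = iris, method = "rpart",
             trControl = train_control)

# One row of hold-out performance per fold for the selected tuning value
fold_results <- fit$resample
print(fold_results)
print(mean(fold_results$Accuracy))

# fit$finalModel is the rpart tree refit on all rows of iris
final_tree <- fit$finalModel
print(class(final_tree))
```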

Predicting Probabilities for GBM with caret library

淺唱寂寞╮, submitted on 2019-11-28 11:28:29
A similar question was asked, but the link in its answer points to a random forest example, which doesn't seem to work in my case. Here is an example of what I'm trying to do:

    gbmGrid <- expand.grid(interaction.depth = c(5, 9), n.trees = (1:3)*200, shrinkage = c(0.05, 0.1))
    fitControl <- trainControl( method = "cv", number = 3, classProbs = TRUE)
    gbmFit <- train(strong~.-Id-PlayerName, data = train[1:10000,], method = "gbm", trControl = fitControl, verbose = TRUE, tuneGrid = gbmGrid)
    gbmFit

Everything goes fine, and I get the best parameters. Now if I do the prediction:

    predictStrong = predict(gbmFit,
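The excerpt stops before the predict call is complete. Assuming the goal is class probabilities rather than predicted labels, a minimal sketch on stand-in data is shown below; the two-class iris subset and the small tuning grid are assumptions for the demo, not the asker's setup.

```r
library(caret)   # method = "gbm" needs the gbm package installed

# Two-class stand-in data: drop one iris species so the outcome is binary
two_class <- droplevels(subset(iris, Species != "setosa"))

set.seed(7)
fitControl <- trainControl(method = "cv", number = 3, classProbs = TRUE)

# Small illustrative grid; recent caret versions also expect n.minobsinnode
gbmGrid <- expand.grid(interaction.depth = 1, n.trees = 50,
                       shrinkage = 0.1, n.minobsinnode = 5)

gbmFit <- train(Species ~ ., data = two_class, method = "gbm",
                trControl = fitControl, tuneGrid = gbmGrid, verbose = FALSE)

# type = "prob" returns one probability column per class level
probs <- predict(gbmFit, newdata = head(two_class), type = "prob")
print(probs)
```

Calling predict without type (or with type = "raw") returns the predicted class labels instead.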

parRF on caret not working for more than one core

故事扮演, submitted on 2019-11-28 09:26:54
parRF from the caret R package is not working for me with more than one core, which is quite ironic, given that the par in parRF stands for parallel. I'm on a Windows machine, if that is a relevant piece of information. I checked that I'm using the latest and greatest versions of caret and doParallel. I made a minimal example and give the results below. Any ideas?

Source code

    library(caret)
    library(doParallel)
    trCtrl <- trainControl( method = "repeatedcv" , number = 2 , repeats = 5 , allowParallel = TRUE )
    # WORKS
    registerDoParallel(1)
    train(form = Species~., data=iris, trControl = trCtrl, method=

Warning message: “missing values in resampled performance measures” in caret train() using rpart

主宰稳场, submitted on 2019-11-28 08:59:18
I am using the caret package to train a model with the "rpart" package:

    tr = train(y ~ ., data = trainingDATA, method = "rpart")

The data has no missing values or NAs, but when running the command a warning message comes up:

    Warning message:
    In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
      There were missing values in resampled performance measures.

Does anyone know (or could point me to where to find an answer) what this warning means? I know it is telling me that there were missing values in resampled performance measures - but what does that exactly mean and how can a
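One frequently cited cause for regression models (an assumption here, since the excerpt does not show the data) is a resample in which the tree makes no split and predicts a single constant. R-squared computed as a squared correlation is then undefined, which shows up as a missing performance value. The numbers below are hypothetical and only illustrate the arithmetic.

```r
# Observed outcomes in one hypothetical hold-out resample
obs <- c(3.1, 4.7, 5.2, 6.8, 7.4)

# A tree with no splits predicts the same value for every row
pred <- rep(5.4, length(obs))

# R-squared as a squared correlation is undefined for constant predictions
rsq <- suppressWarnings(cor(pred, obs)^2)
print(rsq)

# RMSE is still well defined for the same predictions
rmse <- sqrt(mean((pred - obs)^2))
print(rmse)
```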

Error when I try to predict class probabilities in R - caret

耗尽温柔, submitted on 2019-11-28 08:06:49
I've built a model using caret. When the training was completed I got the following warning:

    Warning message:
    In train.default(x, y, weights = w, ...) :
      At least one of the class levels are not valid R variables names; This may cause errors if class probabilities are generated because the variables names will be converted to: X0, X1

The names of the variables are:

    str(train)
    'data.frame': 7395 obs. of 30 variables:
     $ alchemy_category : Factor w/ 13 levels "arts_entertainment",..: 2 8 6 6 11 6 1 6 3 8 ...
     $ alchemy_category_score : num 3737 2052 4801 3816 3179 ...
     $ avglinksize : num 2.06 3.68
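The "X0, X1" in the warning suggests the outcome factor has levels "0" and "1", which are not syntactically valid R names. A minimal sketch, using a made-up outcome vector and arbitrary replacement labels (both assumptions), shows the conversion and one way to relabel the factor before training.

```r
# Outcome coded as 0/1, as the warning's "X0, X1" suggests (illustrative data)
y <- factor(c(0, 1, 1, 0, 1))

# make.names shows what the levels become when forced into valid R names
converted <- make.names(levels(y))
print(converted)

# Relabel up front with names of your choosing ("no"/"yes" are arbitrary)
y_named <- factor(y, levels = c("0", "1"), labels = c("no", "yes"))
print(levels(y_named))
```

Training on the relabeled factor means the probability columns returned later carry the labels you chose rather than automatically converted ones.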

Applying k-fold Cross Validation model using caret package

霸气de小男生, submitted on 2019-11-27 20:10:44
Question: Let me start by saying that I have read many posts on cross-validation and it seems there is much confusion out there. My understanding is simply this: perform k-fold cross-validation (e.g. 10 folds) to understand the average error across the 10 folds. If it is acceptable, then train the model on the complete data set. I am attempting to build a decision tree using rpart in R and taking advantage of the caret package. Below is the code I am using.

    # load libraries
    library(caret)
    library