r-caret

R's caret training errors when y is not a factor

狂风中的少年 submitted on 2019-12-02 02:12:01
I am using RStudio with Kaggle's forest cover data and keep getting an error when trying to use the knn3 function in caret. Here is my code:

library(caret)
train <- read.csv("C:/data/forest_cover/train.csv", header=T)
trainingRows <- createDataPartition(train$Cover_Type, p=0.8, list=F)
head(trainingRows)
train_train <- train[trainingRows,]
train_test <- train[-trainingRows,]
knnfit <- knn3(train_train[,-56], train_train$Cover_Type)

The last line gives me this in the console:

Error in knn3.matrix(x, y = y, k = k, ...) : y must be a factor

As the error message states, y must be a factor.
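A minimal sketch of one common fix, assuming the same train.csv layout as above (Cover_Type is read in as an integer): convert the outcome to a factor before calling knn3.

library(caret)

train <- read.csv("C:/data/forest_cover/train.csv", header = TRUE)

# Cover_Type is read from the CSV as an integer; knn3() requires a factor outcome.
train$Cover_Type <- factor(train$Cover_Type)

trainingRows <- createDataPartition(train$Cover_Type, p = 0.8, list = FALSE)
train_train  <- train[trainingRows, ]

# Drop the outcome column by name rather than by position (column 56 above),
# so the predictor set cannot silently include the outcome.
knnfit <- knn3(train_train[, setdiff(names(train_train), "Cover_Type")],
               train_train$Cover_Type,
               k = 5)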

R rfe function (caret package) error: there should be the same number of samples in x and y

孤者浪人 submitted on 2019-12-01 21:26:20
While trying the rfe example from the "caret" package taken from here, I kept receiving this error:

Error in rfe.default(d[1:2901, ], c(1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, :
  there should be the same number of samples in x and y

This question has been asked before, but its solution doesn't apply in this case. Here's the code:

set.seed(7)
# load the libraries
library(mlbench)
library(caret)
# load the data
d <- read.table("d.dat")
# define the control using a random forest selection function
control <- rfeControl(functions=rfFuncs, method="cv", number=10)
# run the RFE algorithm
results <- rfe(d[1
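For reference, a self-contained sketch of the same rfe workflow on a built-in dataset (PimaIndiansDiabetes from mlbench is assumed here in place of d.dat): the requirement behind the error is that the predictor matrix and the outcome vector passed to rfe() must describe exactly the same rows.

library(mlbench)
library(caret)

set.seed(7)
data(PimaIndiansDiabetes)

x <- PimaIndiansDiabetes[, 1:8]   # predictors
y <- PimaIndiansDiabetes[, 9]     # outcome (a factor)

# rfe() stops with "there should be the same number of samples in x and y"
# whenever nrow(x) != length(y), so verify this before calling it.
stopifnot(nrow(x) == length(y))

control <- rfeControl(functions = rfFuncs, method = "cv", number = 10)
results <- rfe(x, y, sizes = 1:8, rfeControl = control)
print(results)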

R caret train Error in evalSummaryFunction: cannot compute class probabilities for regression

我只是一个虾纸丫 submitted on 2019-12-01 20:36:51
> cv.ctrl <- trainControl(method = "repeatedcv", repeats = 3,
+                         summaryFunction = twoClassSummary,
+                         classProbs = TRUE)
>
> set.seed(35)
> glm.tune.1 <- train(y ~ bool_3,
+                     data = train.batch,
+                     method = "glm",
+                     metric = "ROC",
+                     trControl = cv.ctrl)
Error in evalSummaryFunction(y, trControl, classLevels, metric, method) :
  train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning message:
In train.default(x, y, weights = w, ...)
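The message appears because train() sees a regression outcome. A minimal sketch of the usual fix, assuming y in train.batch is a numeric 0/1 column: twoClassSummary and classProbs require a classification model, so the outcome has to be a factor whose levels are valid R names.

library(caret)

# Hypothetical stand-in for train.batch; only the outcome's type matters here.
set.seed(35)
train.batch <- data.frame(
  y      = rbinom(200, 1, 0.5),
  bool_3 = rnorm(200)
)

# A numeric 0/1 outcome makes train() treat the problem as regression, which is
# why evalSummaryFunction() refuses to compute class probabilities.
train.batch$y <- factor(train.batch$y, levels = c(0, 1), labels = c("no", "yes"))

cv.ctrl <- trainControl(method = "repeatedcv", repeats = 3,
                        summaryFunction = twoClassSummary,
                        classProbs = TRUE)

glm.tune.1 <- train(y ~ bool_3,
                    data = train.batch,
                    method = "glm",
                    metric = "ROC",
                    trControl = cv.ctrl)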

Caret error using GBM, but not without caret

白昼怎懂夜的黑 submitted on 2019-12-01 17:27:18
I've been using gbm through caret without problems, but after removing some variables from my data frame it started to fail. I've tried both the GitHub and CRAN versions of the packages mentioned. This is the error:

> fitRF = train(my_data[trainIndex, vars_for_clust], clusterAssignment[trainIndex], method = "gbm", verbose=T)
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa
 Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA
 NA's   :9     NA's   :9
Error in train.default(my_data[trainIndex, vars
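A hedged debugging sketch, assuming my_data, vars_for_clust, trainIndex and clusterAssignment exist as in the call above: "all the Accuracy metric values are missing" usually means every gbm fit inside resampling failed, so the first step is to inspect the reduced predictor set and surface the underlying warnings.

library(caret)

x <- my_data[trainIndex, vars_for_clust]   # assumed to exist as in the question
y <- clusterAssignment[trainIndex]

# Checks that often explain why every resample failed after dropping variables:
str(x)                                     # character columns or unexpected types?
colSums(is.na(x))                          # NAs introduced by the variable removal?
table(droplevels(as.factor(y)))            # classes with too few observations?

# Re-running with a small, explicit resampling scheme and a factor outcome tends
# to surface the underlying gbm error instead of caret's aggregated NA summary.
fit_debug <- train(x, factor(y),
                   method    = "gbm",
                   verbose   = TRUE,
                   trControl = trainControl(method = "cv", number = 2))
warnings()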

Error: nrow(x) == n is not TRUE when using Train in Caret

落爺英雄遲暮 submitted on 2019-12-01 16:25:07
I have a training set that looks like this:

Name     Day       Area     X        Y      Month  Night
ATTACK   Monday    LA       -122.41  37.78  8      0
VEHICLE  Saturday  CHICAGO  -1.67    3.15   2      0
MOUSE    Monday    TAIPEI   -12.5    3.1    9      1

Name is the outcome/dependent variable. I converted Name, Area and Day into factors, but I wasn't sure whether I should also do that for Month and Night, which only take on integer values 1-12 and 0-1, respectively. I then convert the data into matrices:

ynn <- model.matrix(~Name , data = trainDF)
mnn <- model.matrix(~ Day + Area + X + Y + Month + Night, data = trainDF)

I then set up the tuning parameters:

nnTrControl=trainControl
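A minimal sketch of one way to avoid the "nrow(x) == n is not TRUE" stop, assuming a data frame trainDF shaped like the sample above (method = "nnet" is an assumption taken from the nnTrControl name): the predictor matrix and outcome handed to train() must line up row for row, and the outcome is easier to pass as a single factor vector than as a second model matrix.

library(caret)

# Build the predictor matrix; drop the intercept column added by model.matrix().
mnn <- model.matrix(~ Day + Area + X + Y + Month + Night, data = trainDF)[, -1]

# Keep the outcome as one factor vector rather than a multi-column model
# matrix; train() expects exactly one outcome value per row of x.
y <- trainDF$Name

stopifnot(nrow(mnn) == length(y))   # essentially the check that train() fails

fit <- train(x = mnn, y = y,
             method    = "nnet",
             trControl = trainControl(method = "cv", number = 5),
             trace     = FALSE)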

Variable importance using the caret package (error); RandomForest algorithm

回眸只為那壹抹淺笑 submitted on 2019-12-01 15:06:54
I am trying to obtain the variable importance of an rf model in any way I can. This is the approach I have tried so far, but alternative suggestions are very welcome. I have trained a model in R:

require(caret)
require(randomForest)
myControl = trainControl(method='cv', number=5, repeats=2, returnResamp='none')
model2 = train(increaseInAssessedLevel~., data=trainData, method='rf', trControl=myControl)

The dataset is fairly large, but the model runs fine. I can access its parts and run commands such as:

> model2[3]
$results
  mtry      RMSE  Rsquared      RMSESD RsquaredSD
1    2 0.1901304 0.3342449 0.004586902 0
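A short sketch of the standard route, assuming model2 was fitted as above: caret exposes model-specific importance through varImp(), and the underlying randomForest fit lives in model2$finalModel.

library(caret)
library(randomForest)

# caret's wrapper around randomForest::importance(), rescaled to 0-100.
imp <- varImp(model2, scale = TRUE)
print(imp)
plot(imp, top = 20)                 # lattice plot of the 20 most important predictors

# Equivalent call on the underlying randomForest object:
importance(model2$finalModel)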

How to pass a character vector to the train function in caret (R)

心已入冬 submitted on 2019-12-01 14:08:27
I want to reduce the number of variables when I train my model. I have a total of 784 features that I want to reduce to, say, 500. I can build a long string of the selected features with paste(), collapsed with "+". For example, let's say this is my string:

val <- "pixel40+pixel46+pixel48+pixel65+pixel66+pixel67"

Then I would like to pass it to the train function like so:

Rf_model <- train(label~val, data =training, method="rf", ntree=200, na.action=na.omit)

but I get an error from model.frame.default(form = label ~ val, data = training, na.action = na.omit). Thanks!
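A minimal sketch, assuming training contains a label column and pixel* predictors: a character string is not a formula, so train() treats val as a single (non-existent) variable; building an actual formula object first avoids the model.frame.default error.

library(caret)

vars <- c("pixel40", "pixel46", "pixel48", "pixel65", "pixel66", "pixel67")

# Turn the character vector into a real formula object.
form <- reformulate(termlabels = vars, response = "label")
# Equivalently: form <- as.formula(paste("label ~", paste(vars, collapse = " + ")))

Rf_model <- train(form, data = training,
                  method = "rf", ntree = 200, na.action = na.omit)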
