r-caret

Caret objecting to outcomes labels: Error: At least one of the class levels is not a valid R variable name

耗尽温柔 submitted on 2019-12-11 15:17:42
Question: caret gives me the error below. I'm training an SVM for prediction starting from a bag of words and wanted to use caret to tune the C parameter, however:

    bow.model.svm.tune <- train(Training.match ~ .,
        data = data.frame(
            Training.match = factor(Training.Data.old$Training.match, labels = c('no match', 'match')),
            Text.features.dtm.df) %>%
            filter(Training.Data.old$Data.tipe == 'train'),
        method = 'svmRadial',
        tuneLength = 9,
        preProc = c("center","scale"),
        metric="ROC",
        trControl = trainControl(
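
The error in the title refers to the class labels: 'no match' contains a space, so it is not a syntactically valid R variable name. Below is a minimal sketch of one way to sanitise the labels with base R's make.names() before passing the factor to train(). The vector `raw` is a made-up stand-in for Training.Data.old$Training.match, which the excerpt does not show.

```r
# Hypothetical stand-in for Training.Data.old$Training.match (not shown in the question).
raw <- c(0, 1, 1, 0, 1)

# Labels as in the question: 'no match' contains a space.
outcome <- factor(raw, labels = c('no match', 'match'))

# make.names() converts each level into a syntactically valid R name.
levels(outcome) <- make.names(levels(outcome))

print(levels(outcome))
```

After this step the levels read "no.match" and "match", which can be used as column names for class probabilities.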

R caret naïve bayes accuracy is null

岁酱吖の submitted on 2019-12-11 13:28:56
Question: I have one dataset to train with SVM and Naïve Bayes. SVM works, but Naïve Bayes doesn't. The source code follows:

    library(tools)
    library(caret)
    library(doMC)
    library(mlbench)
    library(magrittr)
    library(caret)
    CORES <- 5 #Optional
    registerDoMC(CORES) #Optional
    load("chat/rdas/2gram-entidades-erro.Rda")
    set.seed(10)
    split=0.60
    maFinal$resposta <- as.factor(maFinal$resposta)
    data_train <- as.data.frame(unclass(maFinal[ trainIndex,]))
    data_test <- maFinal[-trainIndex,]
    treegram25NotNull

How to change the cost matrix in R with caret and C5.0Cost?

落花浮王杯 submitted on 2019-12-11 11:12:14
Question: I'm currently experimenting with caret and C5.0Cost in R. So far I have a base model that works fine, but the tuning parameters are giving me headaches: I seem unable to change the cost for false positives.

    library(mlbench)
    data(Sonar)
    library(caret)
    set.seed(990)
    inTraining <- createDataPartition(Sonar$Class, p = .5, list = FALSE)
    inTraining
    training <- Sonar[inTraining,]
    test <- Sonar[-inTraining,]
    set.seed(990)
    fitControl <- trainControl(method="repeatedcv", number=10,
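
In caret, the cost for C5.0Cost is usually varied through the tuning grid. Below is a sketch of such a grid, assuming the tuning parameters of method = "C5.0Cost" are trials, model, winnow and cost (worth confirming with modelLookup("C5.0Cost") on the installed version). The values are arbitrary illustrations, and which kind of error the cost applies to depends on how the model maps it to the class levels, so check that against the model documentation.

```r
# Assumed tuning parameters for method = "C5.0Cost": trials, model, winnow, cost.
# The values below are illustrative, not recommendations.
costGrid <- expand.grid(
  trials = c(1, 10),
  model  = "tree",
  winnow = FALSE,
  cost   = c(1, 2, 5),
  stringsAsFactors = FALSE
)

print(costGrid)

# The grid would then be supplied to train(), for example:
# train(Class ~ ., data = training, method = "C5.0Cost",
#       tuneGrid = costGrid, trControl = fitControl)
```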

StepLDA without Cross Validation

自作多情 submitted on 2019-12-11 10:06:06
Question: I would like to select the variables on the basis of the training error. For that reason I set method in trainControl to "none". However, if I run the function below twice, I get two different errors (correctness rates). In this example the difference is negligible, but I wouldn't have expected any difference at all. Does somebody know where this difference comes from?

    library(caret)
    c_1 <- trainControl(method = "none")
    maxvar <-(4)
    direction <-"forward"
    tune_1 <-data.frame

Create partition based in two variables

旧街凉风 submitted on 2019-12-11 08:54:57
Question: I have a data set with two outcome variables, case1 and case2. case1 has 4 levels, while case2 has 50 (the number of levels in case2 could increase later). I would like to create a train/test data partition that keeps the class ratios of both variables. The real data is imbalanced for both case1 and case2. As an example:

    library(caret)
    set.seed(123)
    matris=matrix(rnorm(10),1000,20)
    case1 <- as.factor(ceiling(runif(1000, 0, 4)))
    case2 <- as.factor(ceiling(runif(1000, 0, 50)))
    df <- as.data.frame(matris)
    df$case1 <-
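
One way to approach this is to stratify on the combination of the two variables. The sketch below uses base R only and assumes an 80/20 split (the question does not state the ratio). It builds one stratum per observed (case1, case2) pair with interaction() and samples within each stratum. The same combined factor could instead be passed as y to createDataPartition().

```r
set.seed(123)
case1 <- as.factor(ceiling(runif(1000, 0, 4)))
case2 <- as.factor(ceiling(runif(1000, 0, 50)))

# One stratum per observed (case1, case2) combination.
strata <- interaction(case1, case2, drop = TRUE)

# Row indices grouped by stratum.
idx_by_stratum <- split(seq_along(strata), strata)

# Take about 80% of each stratum for training (assumed ratio).
# Rounding up means a stratum with a single row goes entirely to training.
train_idx <- unlist(lapply(idx_by_stratum, function(i) {
  n_take <- ceiling(0.8 * length(i))
  i[sample.int(length(i), n_take)]  # index-based to stay safe when length(i) == 1
}), use.names = FALSE)

test_idx <- setdiff(seq_along(strata), train_idx)

print(c(train = length(train_idx), test = length(test_idx)))
```

With 200 possible combinations in 1000 rows, many strata are small, so the per-stratum ratio can only be approximate. Rare combinations may need to be merged before splitting.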

How to adapt datasplit sizes with createDataPartition()

给你一囗甜甜゛ submitted on 2019-12-11 04:49:08
Question: I have a question about splitting data into train, test and validation sets with createDataPartition(). I found a solution that works perfectly for a 60/20/20 split. However, I don't see how to adapt it to other proportions while still ensuring that the subsets don't overlap. For example, I would like to split into 80/10/10 or any other ratio.

    library("caret")
    # Draw a random, stratified sample including p percent of the data
    idx.train <- createDataPartition(y = iris$Species, p = 0.8, list = FALSE)
    #
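
A sketch of one way to get an 80/10/10 split, assuming caret is installed: draw the training set first, then split the remainder in half. Because the second call only sees rows that were not selected for training, the three subsets cannot overlap. The names `rest`, `idx.val`, `val` and `test` are illustrative and not taken from the question.

```r
library(caret)

set.seed(42)

# Step 1: 80% stratified sample for training.
idx.train <- createDataPartition(y = iris$Species, p = 0.8, list = FALSE)
train <- iris[idx.train, ]
rest  <- iris[-idx.train, ]

# Step 2: split the remaining 20% in half (10% validation, 10% test).
# p here is a share of the remainder: p = validation share / (1 - training share).
idx.val <- createDataPartition(y = rest$Species, p = 0.5, list = FALSE)
val  <- rest[idx.val, ]
test <- rest[-idx.val, ]

print(c(train = nrow(train), val = nrow(val), test = nrow(test)))
```

For other ratios only the two p values change. For example, a 70/20/10 split would use p = 0.7 in the first step and p = 0.2 / 0.3 in the second.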

Fatal error with train() in caret on Windows 7, R 3.0.2, caret 6.0-21

随声附和 submitted on 2019-12-11 00:46:08
Question: I am trying to use train() in caret to fit a classification model, but I'm hitting some kind of unhandled exception and my R session crashes before printing any error information to the R console. Windows error:

    R for Windows terminal front-end has stopped working

I am running Windows 7, R 3.0.2, caret 6.0-21, and have tried running this on both the 32-bit and 64-bit versions of R, in RStudio and directly in the R console, with the same result each time. Here is my call to train:

    library(

Binomial GLM using caret train

﹥>﹥吖頭↗ submitted on 2019-12-10 21:26:42
Question: I would like to fit a binomial GLM on a certain dataset. Using glm(..., family=binomial) everything works fine, but I would like to do it with the caret train() function. Unfortunately I get an unexpected error which I cannot get rid of.

    library("marginalmodelplots")
    library("caret")
    MissUSA <- MissAmerica08[,c(2,4,6,7,8,10)]
    formula<-cbind(Top10, 9-Top10)~.
    glmfit <- glm(formula=formula,data=MissUSA,family=binomial())
    trainfit<-train(form=formula,data=MissUSA,trControl=trainControl(method

Principal Component Analysis with Caret

放肆的年华 submitted on 2019-12-10 19:54:02
Question: I'm using caret's PCA preprocessing.

    multinomFit <- train(LoanStatus~., train, method = "multinom", std=TRUE, family=binomial, metric = "ROC", thresh = 0.85, verbose = TRUE, pcaComp=7, preProcess=c("center", "scale", "pca"), trControl = ctrl)

I specified the number of PCA components to be 7. Why does the summary show the fit using 68 components?

    summary(multinomFit)
    Call:
    multinom(formula = .outcome ~ ., data = dat, decay = param$decay, std = TRUE, family = ..2, thresh = 0.85, verbose = TRUE
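
One likely explanation, which the excerpt does not confirm: extra arguments given to train() are forwarded to the model function (thresh = 0.85 and std = TRUE both show up in the multinom call above), so a pcaComp passed that way may never reach the preprocessing step. The sketch below assumes caret is installed and uses iris as stand-in data. It fixes the number of components first through preProcess() directly and then through trainControl(preProcOptions = ...).

```r
library(caret)

# Stand-in predictors; the question's loan data is not available.
x <- iris[, 1:4]

# Directly: keep exactly 2 principal components after centering and scaling.
pp  <- preProcess(x, method = c("center", "scale", "pca"), pcaComp = 2)
pcs <- predict(pp, x)
print(colnames(pcs))

# Within train(): pass the option via trainControl so it reaches preProcess().
ctrl <- trainControl(method = "cv", number = 5,
                     preProcOptions = list(pcaComp = 2))
# fit <- train(Species ~ ., data = iris, method = "multinom",
#              preProcess = c("center", "scale", "pca"),
#              trControl = ctrl, trace = FALSE)
```

When pcaComp is set, it takes precedence over thresh, so the variance threshold no longer determines how many components are kept.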

Access all models produced by rfe in caret

﹥>﹥吖頭↗ submitted on 2019-12-10 17:52:12
Question: This question was migrated from Cross Validated because it can be answered on Stack Overflow. Migrated 7 years ago. I'm using the rfe function in the caret package to do feature selection for a logistic regression model. I'm looking at sizes of 5, 10, 15, 20, and 25, selecting the best model using Rsquared (my dependent variable is 0,1). Is there a way to access the other models produced by the rfe function beyond the final selected model?

Answer 1: There is no automatic way. The best thing you can