r-caret

How to solve the “The data cannot have more levels than the reference” error when using confusionMatrix?

自闭症网瘾萝莉.ら submitted on 2019-12-24 00:42:04

Question: I'm using R. I split the data into train and test sets to estimate prediction accuracy. This is my code:

    library("tree")
    credit <- read.csv("C:/Users/Administrator/Desktop/german_credit (2).csv")
    library("caret")
    set.seed(1000)
    intrain <- createDataPartition(y = credit$Creditability, p = 0.7, list = FALSE)
    train <- credit[intrain, ]
    test <- credit[-intrain, ]
    treemod <- tree(Creditability ~ ., data = train)
    plot(treemod)
    text(treemod)
    cv.trees <- cv.tree(treemod, FUN = prune.tree)
    plot(cv.trees)
    prune.trees <- prune.tree
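The error in the title typically means the prediction vector and the reference vector are factors with different level sets (or one of them is not a factor at all). A minimal sketch of the usual fix — coercing both to factors with identical levels before calling `confusionMatrix` (the toy vectors below are illustrative, not from the question):

```r
library(caret)

# Toy predictions and reference with mismatched types/levels
pred <- c("good", "bad", "good", "good")           # character predictions
ref  <- factor(c("good", "bad", "bad", "good"),
               levels = c("bad", "good"))          # factor reference

# Coerce predictions to a factor with the SAME levels as the reference
pred <- factor(pred, levels = levels(ref))

confusionMatrix(pred, ref)
```

If the reference accidentally carries extra (empty) levels, `droplevels()` on both vectors before the coercion also resolves the mismatch.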

Caret Binary Classification with RMSE

痴心易碎 submitted on 2019-12-24 00:30:19

Question: Is there a way to get caret to use RMSE with a binary classification problem? If you try metric = "RMSE" on a classification problem you will receive the message:

    Error in train.default(x, y, weights = w, ...) :
      Metric RMSE not applicable for classification models

Which makes sense. But is there a way to define a custom metric? For example, if your outcome is 0 or 1, you can define the error as outcome - p, where p is the probability predicted by the model. EDIT ====================
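caret does support custom metrics through the summaryFunction argument of trainControl. A hedged sketch of an RMSE-on-probabilities metric (the function name brierSummary and the two-class iris subset are illustrative assumptions, not from the question):

```r
library(caret)

# Custom summary: RMSE between the predicted class probability and the 0/1 outcome
brierSummary <- function(data, lev = NULL, model = NULL) {
  obs <- ifelse(data$obs == lev[2], 1, 0)   # 0/1 encoding of the observed class
  p   <- data[[lev[2]]]                     # predicted probability of class lev[2]
  c(ProbRMSE = sqrt(mean((obs - p)^2)))
}

ctrl <- trainControl(method = "cv", number = 5,
                     classProbs = TRUE,              # needed so probabilities exist
                     summaryFunction = brierSummary)

dat <- droplevels(iris[1:100, ])                     # two-class toy data
set.seed(1)
fit <- train(Species ~ ., data = dat,
             method = "glm", family = binomial,
             metric = "ProbRMSE", maximize = FALSE,  # smaller is better
             trControl = ctrl)
```

Note that classProbs = TRUE is required, and maximize = FALSE must be set explicitly since caret would otherwise try to maximize an unknown metric.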

How should I get the coefficients of Lasso Model?

佐手、 submitted on 2019-12-24 00:19:37

Question: Here is my code:

    library(MASS)
    library(caret)
    df <- Boston
    set.seed(3721)
    cv.10.folds <- createFolds(df$medv, k = 10)
    lasso_grid <- expand.grid(fraction = c(1, 0.1, 0.01, 0.001))
    lasso <- train(medv ~ ., data = df,
                   preProcess = c("center", "scale"),
                   method = "lasso",
                   tuneGrid = lasso_grid,
                   trControl = trainControl(method = "cv", number = 10,
                                            index = cv.10.folds))
    lasso

Unlike a linear model, I cannot find the coefficients of the lasso regression model in summary(lasso). How should I do that? Or maybe I
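For method = "lasso" (which wraps elasticnet::enet), the coefficients at the selected tuning value can usually be extracted from the final model via predict.enet with type = "coefficients" — a sketch, assuming a fitted train object named lasso as in the question:

```r
library(elasticnet)

# Coefficients of the final lasso fit at the fraction chosen by caret
coefs <- predict(lasso$finalModel,
                 type = "coefficients",
                 s    = lasso$bestTune$fraction,
                 mode = "fraction")

coefs$coefficients   # named coefficient vector; zeros are dropped predictors
```

The returned coefficients are on the preprocessed (centered and scaled) scale, since preProcess was applied before fitting.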

Difference in average AUC computation using ROCR and pROC (R)

风格不统一 submitted on 2019-12-23 17:32:02

Question: I am working with cross-validation data (10-fold, repeated 5 times) from an SVM-RFE model built with the caret package. I know that caret works with the pROC package when computing metrics, but I need to use the ROCR package in order to obtain the average ROC curve. However, I noticed that the average AUC values were not the same with each package, so I am not sure whether I can use the two interchangeably. The code I used to check this is:

    predictions_NG3 <- list()
    labels_NG3 <- list()
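On a single set of predictions the two packages should agree on AUC; discrepancies usually come from how curves are averaged across folds (vertical/threshold averaging in ROCR versus pooling or per-fold means), not from the AUC computation itself. A small check on synthetic scores (illustrative only):

```r
library(pROC)
library(ROCR)

set.seed(1)
labels <- rep(c(0, 1), each = 50)
scores <- labels + rnorm(100)        # noisy scores, class 1 scores higher

# AUC via pROC
auc_proc <- as.numeric(pROC::auc(labels, scores))

# AUC via ROCR
pred     <- ROCR::prediction(scores, labels)
auc_rocr <- ROCR::performance(pred, "auc")@y.values[[1]]

all.equal(auc_proc, auc_rocr)        # identical on the same fold
```

If the per-fold AUCs match but the averages do not, the difference lies in the averaging strategy applied afterwards rather than in either package's AUC.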

R: set.seed() results don't match if caret package loaded

大城市里の小女人 submitted on 2019-12-23 09:38:33

Question: I am using createFolds() in R (version 3.3.0) to create train/test partitions. To make the results reproducible, I used set.seed() with a seed value of 10. As expected, the generated folds were reproducible. But when I loaded the caret package just after setting the seed and then called createFolds, the created folds were different (although still reproducible). Specifically, the created folds differ in the following two cases:

    Case 1:
    library(caret)
    set.seed(10)
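Loading a package can itself consume random numbers (for example via load-time code in the package or its dependencies), which advances the RNG state and shifts everything sampled afterwards. The usual remedy, sketched below, is to call set.seed() immediately before the randomized call, after all library() calls:

```r
# Fragile: if loading the package draws from the RNG, the folds
# below will differ from a run without the library() call.
set.seed(10)
library(caret)
folds_a <- createFolds(1:100, k = 5)

# Robust: seed the RNG right before the randomized call.
library(caret)
set.seed(10)
folds_b <- createFolds(1:100, k = 5)
```

Both variants are individually reproducible; only the second is insensitive to whatever happens during package loading.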

caret: using random forest and including cross-validation

为君一笑 submitted on 2019-12-22 15:14:04

Question: I used the caret package to train a random forest, including repeated cross-validation. I'd like to know whether the OOB estimate, as in Breiman's original RF, is still used, or whether it is replaced by the cross-validation. If it is replaced, do I keep the advantages described in Breiman (2001), such as increased accuracy from reduced correlation between the input data? Since OOB samples are drawn with replacement and CV folds are drawn without replacement, are the two procedures comparable? What is the OOB estimate of
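In caret, cross-validation is used to select the tuning parameter (mtry), but the final randomForest fit still carries its own Breiman-style OOB estimate, so the two numbers coexist rather than one replacing the other. A sketch showing where each lives:

```r
library(caret)
library(randomForest)

set.seed(42)
fit <- train(Species ~ ., data = iris,
             method = "rf",
             trControl = trainControl(method = "repeatedcv",
                                      number = 10, repeats = 3))

fit$results       # repeated-CV accuracy used to choose mtry
fit$finalModel    # the underlying randomForest object, printed with its OOB error

# OOB error of the final fit after all trees (default ntree = 500)
fit$finalModel$err.rate[fit$finalModel$ntree, "OOB"]
```

The CV estimate measures performance of the whole tuning procedure; the OOB estimate measures the single final forest.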

“length of 'dimnames' [1] not equal to array extent” error in linear regression summary in R

偶尔善良 submitted on 2019-12-22 08:21:02

Question: I'm running a straightforward linear regression fit on the following data frame:

    > str(model_data_rev)
    'data.frame': 128857 obs. of 12 variables:
     $ ENTRY_4 : num 186 218 208 235 256 447 471 191 207 250 ...
     $ ENTRY_8 : num 724 769 791 777 707 237 236 726 773 773 ...
     $ ENTRY_12: num 2853 2989 3174 3027 3028 ...
     $ ENTRY_16: num 2858 3028 3075 2992 3419 ...
     $ ENTRY_20: num 7260 7188 7587 7560 7165 ...
     $ EXIT_4  : num 70 82 105 114 118 204 202 99 73 95 ...
     $ EXIT_8  : num 1501 1631 1594 1576

R package caret: print iteration when using parallel

孤街醉人 submitted on 2019-12-22 07:00:42

Question: Is there any way to print the iteration number when using the caret::train function in parallel? I know there is an option called verbose, but it doesn't seem to print anything when I use multiple cores.

Answer 1: I've found a solution. All we need is to register the cores via the makeCluster function:

    library(doSNOW)
    cl <- makeCluster(30, outfile = "")
    registerDoSNOW(cl)

This way, the log is printed to the console. I've tested it with plain R, RStudio, and RServer on macOS, Windows, and Ubuntu (even AWS). For example, iris <- iris[1
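Putting the answer together with a training call: the outfile = "" argument redirects worker output to the master console, which is what makes per-iteration messages visible under parallel execution (the cluster size and toy model below are illustrative):

```r
library(caret)
library(doSNOW)

cl <- makeCluster(4, outfile = "")   # outfile = "" forwards worker output to the console
registerDoSNOW(cl)

set.seed(1)
fit <- train(Species ~ ., data = iris,
             method = "rf",
             trControl = trainControl(method = "cv", number = 5,
                                      verboseIter = TRUE))   # per-iteration progress

stopCluster(cl)                      # release the workers when done
```

Without outfile = "", the workers' stdout is discarded, so verboseIter appears to do nothing even though it is running.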

Can't install the caret package in R (in my Linux machine)

旧时模样 submitted on 2019-12-22 05:29:20

Question: I am facing the following errors while trying to install the caret package in R:

    g++: error: /tmp/Rtmp2Tos7n/R.INSTALL2e6e30153a74/nloptr/nlopt-2.4.2/lib/libnlopt_cxx.a: No such file or directory
    make: *** [nloptr.so] Error 1
    ERROR: compilation failed for package ‘nloptr’
    * removing ‘/rmt/csfiles/pgrads/mava290/R/x86_64-suse-linux-gnu-library/3.1/nloptr’
    Warning in install.packages :
      installation of package ‘nloptr’ had non-zero exit status
    ERROR: dependency ‘nloptr’ is not available for
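The failure is not in caret itself but in its dependency nloptr, which compiles against the system NLopt library. The usual fix is to install the system development package first, then retry from R — a sketch (the library path in the error suggests openSUSE; the exact system package names are assumptions and vary by distribution):

```r
# Before running this, install the NLopt development headers from a shell:
#   sudo zypper install nlopt-devel       # openSUSE (assumed from the path)
#   sudo apt-get install libnlopt-dev     # Debian/Ubuntu equivalent
# Then retry the R installation, dependency first:
install.packages("nloptr")
install.packages("caret", dependencies = TRUE)
```

If no system package is available, building NLopt from source and pointing nloptr at it via its configure arguments is the fallback.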