cross-validation

PCA within cross validation; however, only with a subset of variables

扶醉桌前 submitted on 2021-01-29 20:47:36
Question: This question is very similar to "preprocess within cross-validation in caret"; however, in the project I'm working on I would like to do PCA on only three of my 19 predictors. Here is the example from "preprocess within cross-validation in caret", and I'll use the PimaIndiansDiabetes data for ease (this is not my project data, but the concept should be the same). I would then like to do the preProcess only on a subset of variables, i.e. PimaIndiansDiabetes[, c(4,5,6)]. Is there a …
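
The caret-specific answer is not reproduced here, but the underlying pattern — re-fitting PCA on only a few columns inside every resampling fold — can be sketched with scikit-learn as a rough analogue. Everything below (the synthetic data, the column indices, the classifier) is an illustrative assumption, not the poster's setup:

# Sketch: PCA restricted to three columns, re-fit inside every CV fold (sklearn analogue).
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(768, 19))      # stand-in for 19 predictors
y = rng.integers(0, 2, size=768)    # stand-in binary outcome

pre = ColumnTransformer(
    [("pca", PCA(n_components=2), [3, 4, 5])],  # PCA only on three chosen columns
    remainder="passthrough",                    # the other 16 predictors pass through untouched
)
model = Pipeline([("pre", pre), ("clf", LogisticRegression(max_iter=1000))])

# Because the PCA lives inside the pipeline, it is re-estimated on each training fold only.
print(cross_val_score(model, X, y, cv=10, scoring="roc_auc").mean())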

TypeError: only integer scalar arrays can be converted to a scalar index, while trying K-fold CV

时光毁灭记忆、已成空白 submitted on 2021-01-29 12:30:30
Question: I am trying to perform K-fold CV on a dataset containing 279 files; after running k-means the files have shape (279, 5, 90). I reshaped the data to (279, 5*90) in order to fit it to an SVM. The K-fold CV approach now gives me the error "TypeError: only integer scalar arrays can be converted to a scalar index". #input with open("dataset.pkl", "rb") as file: dataset = pkl.load(file) print(len(dataset)) x = [i[0] for i in dataset] # k-means cc y = [i[1] for i in dataset] # label …
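
That TypeError typically appears when KFold's integer index arrays are used to index a plain Python list such as the x and y built above; a minimal sketch of the usual fix, with stand-in data in place of the pickled dataset, converts the lists to NumPy arrays (and flattens them) before splitting:

# Sketch: convert the lists to arrays so KFold's integer index arrays can be used.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

# Stand-ins for the 279 items of shape (5, 90) described above.
x = [np.random.rand(5, 90) for _ in range(279)]
y = [i % 2 for i in range(279)]

x = np.asarray(x).reshape(len(x), -1)   # (279, 5, 90) -> (279, 450)
y = np.asarray(y)                       # indexing a plain list with an index array raises the TypeError

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, test_idx in kf.split(x):
    clf = SVC().fit(x[train_idx], y[train_idx])
    print(clf.score(x[test_idx], y[test_idx]))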

R: can the caret::train function for glmnet cross-validate AUC at fixed alpha and lambda?

霸气de小男生 submitted on 2021-01-29 10:12:26
Question: I would like to calculate the 10-fold cross-validated AUC of an elastic net regression model at the optimal alpha and lambda using caret::train. https://stats.stackexchange.com/questions/69638/does-caret-train-function-for-glmnet-cross-validate-for-both-alpha-and-lambda/69651 explains how to cross-validate alpha and lambda with caret::train. My question on Cross Validated was closed because it was classified as a programming question: https://stats.stackexchange.com/questions/505865/r …
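
The question itself is about caret and glmnet, but the underlying computation — a 10-fold cross-validated AUC at one fixed mixing parameter and penalty strength — can be sketched in Python for comparison. The data and the l1_ratio and C values below are illustrative assumptions, with l1_ratio playing the role of glmnet's alpha and C roughly the role of 1/lambda:

# Sketch: 10-fold cross-validated AUC of an elastic-net logistic regression at FIXED hyperparameters.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=1)

model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga",
                       l1_ratio=0.5, C=1.0, max_iter=5000),  # fixed mixing and penalty strength
)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
aucs = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print(aucs.mean(), aucs.std())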

sklearn: use RandomizedSearchCV with custom metrics and catch exceptions

為{幸葍}努か submitted on 2021-01-28 12:35:42
Question: I am using the RandomizedSearchCV function in sklearn with a Random Forest classifier. To see different metrics I am using custom scoring: from sklearn.metrics import make_scorer, roc_auc_score, recall_score, matthews_corrcoef, balanced_accuracy_score, accuracy_score acc = make_scorer(accuracy_score) auc_score = make_scorer(roc_auc_score) recall = make_scorer(recall_score) mcc = make_scorer(matthews_corrcoef) bal_acc = make_scorer(balanced_accuracy_score) scoring = {"roc_auc_score": auc …
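
For reference, RandomizedSearchCV accepts a dictionary of scorers as long as refit names one of the keys, and error_score controls what happens when an individual fit raises instead of aborting the whole search. A minimal sketch under those assumptions (synthetic data, arbitrary parameter ranges):

# Sketch: RandomizedSearchCV with several scorers and non-fatal handling of failing fits.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, balanced_accuracy_score, matthews_corrcoef
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=400, weights=[0.9, 0.1], random_state=0)

scoring = {
    "roc_auc": "roc_auc",                             # built-in scorer, uses predicted probabilities
    "bal_acc": make_scorer(balanced_accuracy_score),
    "mcc": make_scorer(matthews_corrcoef),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [50, 100, 200], "max_depth": [3, 5, None]},
    n_iter=5,
    scoring=scoring,
    refit="roc_auc",       # with multiple metrics, refit must name the one used to pick best_params_
    error_score=np.nan,    # a failing fit scores NaN instead of stopping the whole search
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)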

Caret and GBM: task 1 failed - “arguments imply differing number of rows”

北城以北 submitted on 2021-01-27 14:36:34
Question: I'm trying to run a GBM with caret using the code below: library(caret) library(doParallel) detectCores() registerDoParallel(detectCores() - 1) set.seed(668) in.train <- createDataPartition(y = dat$target, p = 0.80, list = T) ctrl <- trainControl(method = 'cv', number = 2, classProbs = T, verboseIter = T, summaryFunction = LogLossSummary2) gbm.grid <- expand.grid(interaction.depth = 10, n.trees = (2:7) * 50, shrinkage = 0.1) Sys.time() set.seed(1234) gbm.fit <- train(target ~., data = otto.new …

Using sklearn's RandomizedSearchCV with SMOTE oversampling only on training folds

我是研究僧i submitted on 2021-01-21 05:34:11
Question: I have a highly unbalanced dataset (99.5:0.5). I would like to perform hyperparameter tuning on a Random Forest model using sklearn's RandomizedSearchCV. I would like each of the training folds to be oversampled using SMOTE, and each test to be evaluated on the held-out fold keeping the original distribution, without any oversampling. Since these test folds are highly unbalanced, I would like the tests to be evaluated using the F1 score. I have tried the following: from sklearn …
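
The standard pattern for this is to put SMOTE and the classifier in an imbalanced-learn Pipeline, so resampling happens only when each training fold is fitted while the test fold keeps its original distribution. A minimal sketch with synthetic data and arbitrary parameter ranges:

# Sketch: SMOTE inside an imbalanced-learn Pipeline, so only training folds are oversampled.
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

X, y = make_classification(n_samples=4000, weights=[0.995, 0.005],
                           n_informative=5, random_state=0)

pipe = Pipeline([
    ("smote", SMOTE(k_neighbors=3, random_state=0)),   # applied only when a training fold is fitted
    ("rf", RandomForestClassifier(random_state=0)),
])

search = RandomizedSearchCV(
    pipe,
    param_distributions={"rf__n_estimators": [100, 300], "rf__max_depth": [5, 10, None]},
    n_iter=5,
    scoring="f1",                                       # computed on the untouched, imbalanced test fold
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)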

In Keras “ImageDataGenerator”, is the “validation_split” parameter a kind of K-fold cross-validation?

隐身守侯 submitted on 2020-12-30 03:59:06
Question: I am trying to do K-fold cross-validation on a Keras model (with ImageDataGenerator and flow_from_directory for the training and validation data), and I want to know whether the argument "validation_split" in "ImageDataGenerator" test_datagen = ImageDataGenerator( rescale=1. / 255, rotation_range = 180, width_shift_range = 0.2, height_shift_range = 0.2, brightness_range = (0.8, 1.2), shear_range = 0.2, zoom_range = 0.2, horizontal_flip = True, vertical_flip = True, validation_split = 0.1 ) train_datagen = …
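
For context, validation_split carves off a single fixed fraction of the data as the validation subset; it is not K-fold cross-validation. A rough sketch of doing K-fold manually follows, assuming the images have already been loaded into arrays rather than read via flow_from_directory, with a toy model standing in for the real one:

# Sketch: manual K-fold around ImageDataGenerator; validation_split itself is one fixed split.
import numpy as np
from sklearn.model_selection import KFold
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

X = np.random.rand(100, 64, 64, 3).astype("float32")   # stand-in images
y = np.random.randint(0, 2, size=100)                   # stand-in binary labels

datagen = ImageDataGenerator(rotation_range=180, horizontal_flip=True)  # augment training data only

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (tr, va) in enumerate(kf.split(X)):
    model = models.Sequential([
        layers.Conv2D(8, 3, activation="relu", input_shape=(64, 64, 3)),
        layers.GlobalAveragePooling2D(),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(datagen.flow(X[tr], y[tr], batch_size=16),
              validation_data=(X[va], y[va]), epochs=1, verbose=0)
    print("fold", fold, model.evaluate(X[va], y[va], verbose=0))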

How to compute precision, recall and F1 score of an imbalanced dataset for K-fold cross-validation with 10 folds in Python

拟墨画扇 submitted on 2020-12-27 10:09:34
Question: I have an imbalanced dataset with a binary classification problem. I have built a Random Forest classifier and used k-fold cross-validation with 10 folds. kfold = model_selection.KFold(n_splits=10, random_state=42) model = RandomForestClassifier(n_estimators=50) I got the results of the 10 folds: results = model_selection.cross_val_score(model, features, labels, cv=kfold) print results [ 0.60666667 0.60333333 0.52333333 0.73 0.75333333 0.72 0.7 0.73 0.83666667 0.88666667 ] I have calculated …
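
One way to get per-fold precision, recall and F1 in a single pass is cross_validate with a list of scorers. The sketch below uses synthetic imbalanced data in place of the poster's features and labels, and a stratified splitter so every fold keeps the class ratio:

# Sketch: per-fold precision, recall and F1 via cross_validate, on synthetic imbalanced data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

features, labels = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)

kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)  # keeps the class ratio per fold
model = RandomForestClassifier(n_estimators=50, random_state=42)

results = cross_validate(model, features, labels, cv=kfold,
                         scoring=["precision", "recall", "f1"])
for metric in ("precision", "recall", "f1"):
    print(metric, results["test_" + metric].mean())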
