cross-validation

Cross validation in deep neural networks

主宰稳场 submitted on 2020-05-13 04:11:32
Question: How do you perform cross-validation in a deep neural network? I know that to perform cross-validation you train on all folds except one and test on the excluded fold, then repeat this for all k folds and average the accuracies. But how do you do this for each iteration? Do you update the parameters at each fold? Or do you perform k-fold cross-validation at each iteration? Or is each training run on all folds but one considered one iteration? Answer 1: Cross-validation is a general
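
A minimal sketch of what the usual answer amounts to in practice, assuming a Keras-style binary classifier (the build_model architecture, epoch count, and data shapes below are illustrative, not taken from the post): a fresh model is built for every fold, so no weights carry over between folds, and the per-fold accuracies are averaged at the end.

import numpy as np
from sklearn.model_selection import KFold
from tensorflow import keras

def build_model(input_dim):
    # Illustrative architecture; any model-construction function works here.
    model = keras.Sequential([
        keras.Input(shape=(input_dim,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

def cross_validate_nn(X, y, k=5, epochs=10, batch_size=32):
    kf = KFold(n_splits=k, shuffle=True, random_state=0)
    fold_accuracies = []
    for train_idx, test_idx in kf.split(X):
        model = build_model(X.shape[1])  # re-initialise the weights for every fold
        model.fit(X[train_idx], y[train_idx], epochs=epochs,
                  batch_size=batch_size, verbose=0)
        _, acc = model.evaluate(X[test_idx], y[test_idx], verbose=0)
        fold_accuracies.append(acc)
    return np.mean(fold_accuracies)

Each fold is a complete training run with its own epochs and parameter updates; the folds themselves are not iterations of a single model.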

TypeError: 'KFold' object is not iterable

人走茶凉 submitted on 2020-05-10 10:28:37
Question: I'm following one of the kernels on Kaggle, namely A kernel for Credit Card Fraud Detection. I reached the step where I need to perform KFold in order to find the best parameters for Logistic Regression. The following code is shown in the kernel itself, but for some reason (probably because it targets an older version of scikit-learn) it gives me some errors.
def printing_Kfold_scores(x_train_data, y_train_data):
    fold = KFold(len(y_train_data), 5, shuffle=False)
    # Different C parameters
    c_param_range = [0
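
A hedged sketch of the same function against the current scikit-learn API (0.18 and later), where KFold takes n_splits and the fold indices come from .split() rather than from iterating the object itself, which is what triggers the "'KFold' object is not iterable" error; the loop body below only prints fold sizes as a placeholder for the kernel's scoring logic.

from sklearn.model_selection import KFold

def printing_Kfold_scores(x_train_data, y_train_data):
    fold = KFold(n_splits=5, shuffle=False)  # the old KFold(n, n_folds, ...) signature is gone
    for iteration, (train_index, test_index) in enumerate(fold.split(x_train_data), start=1):
        # train_index / test_index are positional indices into the training data
        print(f"Fold {iteration}: {len(train_index)} train rows, {len(test_index)} test rows")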

How can I extract x_train and y_train from train_generator?

梦想与她 submitted on 2020-04-30 11:42:28
Question: In my CNN model I want to extract X_train and y_train from train_generator. I want to use ensemble learning (bagging and boosting) to evaluate the model. The main challenge is how to extract X_train and y_train from train_generator in Python.
history = model.fit_generator(train_generator,
                              steps_per_epoch=num_of_train_samples // batch_size,
                              epochs=10,
                              validation_data=validation_generator,
                              validation_steps=num_of_val_samples // batch_size,
                              callbacks=callbacks)
Answer 1: Well, first of
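
A minimal sketch of one common way to do this, assuming train_generator is a finite Keras directory iterator (ideally created with shuffle=False and without random augmentation, so the collected arrays match the files on disk); generator_to_arrays is a hypothetical helper name, not something from the post.

import numpy as np

def generator_to_arrays(generator):
    xs, ys = [], []
    for _ in range(len(generator)):   # one pass over every batch in the generator
        x_batch, y_batch = next(generator)
        xs.append(x_batch)
        ys.append(y_batch)
    return np.concatenate(xs), np.concatenate(ys)

# X_train, y_train = generator_to_arrays(train_generator)

Note that this materialises the whole training set in memory, which defeats the purpose of a generator for large datasets.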

Cross-validation metrics in scikit-learn for each data split

北城以北 submitted on 2020-04-18 03:49:12
Question: Please, I just need to get the cross-validation statistics explicitly for each split of the (X_test, y_test) data. To try to do so, I did:
kf = KFold(n_splits=n_splits)
X_train_tmp = []
y_train_tmp = []
X_test_tmp = []
y_test_tmp = []
mae_train_cv_list = []
mae_test_cv_list = []
for train_index, test_index in kf.split(X_train):
    for i in range(len(train_index)):
        X_train_tmp.append(X_train[train_index[i]])
        y_train_tmp.append(y_train[train_index[i]])
    for i in range(len(test_index)):
        X_test
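
A self-contained sketch of a tighter way to collect per-fold metrics, replacing the element-by-element inner loops with NumPy fancy indexing; the Ridge model, the make_regression data and the MAE metric are stand-ins for whatever the post actually uses.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold

# Stand-in data and model so the sketch runs on its own.
X_train, y_train = make_regression(n_samples=200, n_features=5, noise=1.0, random_state=0)
model = Ridge()

kf = KFold(n_splits=5)
mae_train_cv_list, mae_test_cv_list = [], []
for train_index, test_index in kf.split(X_train):
    X_tr, X_te = X_train[train_index], X_train[test_index]  # fancy indexing replaces the inner loops
    y_tr, y_te = y_train[train_index], y_train[test_index]
    model.fit(X_tr, y_tr)
    mae_train_cv_list.append(mean_absolute_error(y_tr, model.predict(X_tr)))
    mae_test_cv_list.append(mean_absolute_error(y_te, model.predict(X_te)))

print(mae_train_cv_list, mae_test_cv_list)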

How to calculate cross validation error for ridge regression model?

送分小仙女□ submitted on 2020-04-16 05:49:28
Question: I am trying to fit a ridge regression model on the white wine dataset. I want to use the entire dataset for training and use 10-fold CV to calculate the test error rate. That's the main question: how to calculate the CV test error for a ridge-penalised logistic model. I calculated the best value of lambda (also using CV), and now I want to find the CV test error rate. Currently, my code for calculating the said CV test error is:
cost1 <- function(good, pi=0) mean(abs(good-pi) > 0.5)
ridge
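
The post itself is in R (glmnet/cv.glm style); as a hedged illustration of the same recipe in Python with scikit-learn, the sketch below fits an L2-penalised (ridge) logistic regression at a fixed lambda and estimates its misclassification rate with 10-fold CV. The synthetic data and the lambda value are placeholders, not the white-wine data or the CV-selected lambda from the post.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=11, random_state=0)

best_lambda = 1.0                                  # stand-in for the lambda chosen by CV
clf = LogisticRegression(penalty="l2", C=1.0 / best_lambda, max_iter=1000)

accuracy = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
cv_test_error = 1.0 - accuracy.mean()              # analogue of the 0/1 cost at a 0.5 threshold
print(cv_test_error)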

Why can't I use cv.glm on the output of bestglm?

与世无争的帅哥 submitted on 2020-04-16 05:47:13
Question: I am trying to do best subset selection on the wine dataset, and then I want to get the test error rate using 10-fold CV. The code I used is:
cost1 <- function(good, pi=0) mean(abs(good-pi) > 0.5)
res.best.logistic <- bestglm(Xy = winedata,
                             family = binomial,   # binomial family for logistic
                             IC = "AIC",          # Information criteria
                             method = "exhaustive")
res.best.logistic$BestModels
best.cv.err <- cv.glm(winedata, res.best.logistic$BestModel, cost1, K=10)
However, this gives the error: Error in
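
For comparison only, a hedged Python sketch of the same two-stage idea (exhaustive subset search by AIC, then 10-fold CV error for the chosen model), using statsmodels for the AIC search and scikit-learn for the CV step; the synthetic data stands in for winedata, and this does not address the R-specific cv.glm error itself.

import itertools
import numpy as np
import statsmodels.api as sm
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=6, random_state=0)

# Exhaustive best-subset search by AIC, as bestglm does with IC = "AIC".
best_aic, best_subset = np.inf, None
for k in range(1, X.shape[1] + 1):
    for subset in itertools.combinations(range(X.shape[1]), k):
        cols = list(subset)
        aic = sm.Logit(y, sm.add_constant(X[:, cols])).fit(disp=0).aic
        if aic < best_aic:
            best_aic, best_subset = aic, cols

# 10-fold CV misclassification rate of the selected model (the cv.glm step).
acc = cross_val_score(LogisticRegression(max_iter=1000),
                      X[:, best_subset], y, cv=10).mean()
print("chosen subset:", best_subset, "CV test error:", 1.0 - acc)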

Put customized functions in Sklearn pipeline

给你一囗甜甜゛ submitted on 2020-04-10 03:36:07
Question: In my classification scheme there are several steps, including:
1. SMOTE (Synthetic Minority Over-sampling Technique)
2. Fisher criterion for feature selection
3. Standardization (z-score normalisation)
4. SVC (Support Vector Classifier)
The main parameters to be tuned in the scheme above are the percentile (step 2) and the hyperparameters for the SVC (step 4), and I want to tune them with a grid search. The current solution builds a "partial" pipeline including only steps 3 and 4 of the scheme:
clf = Pipeline([('normal'
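
A hedged sketch of what the full four-step pipeline could look like. It assumes the imbalanced-learn package, whose Pipeline applies a sampler such as SMOTE only to the training folds during grid search, and it substitutes SelectPercentile(f_classif) (an ANOVA F-score) for the Fisher criterion, so it is an approximation of the scheme rather than the poster's exact functions.

from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectPercentile, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in imbalanced data so the sketch runs on its own.
X, y = make_classification(n_samples=400, n_features=30, weights=[0.9, 0.1], random_state=0)

pipe = Pipeline([
    ("smote", SMOTE(random_state=0)),         # 1. oversampling (training folds only)
    ("select", SelectPercentile(f_classif)),  # 2. feature selection (stand-in for Fisher)
    ("normal", StandardScaler()),             # 3. z-score normalisation
    ("svc", SVC()),                           # 4. classifier
])

param_grid = {
    "select__percentile": [10, 30, 50],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01],
}
grid = GridSearchCV(pipe, param_grid, cv=5, scoring="roc_auc").fit(X, y)
print(grid.best_params_, grid.best_score_)

A home-grown selection function can replace SelectPercentile by wrapping it in a class with fit and transform methods (subclassing BaseEstimator and TransformerMixin), which is the usual way to put customized functions into a scikit-learn pipeline.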

How to get roc auc for binary classification in sklearn

妖精的绣舞 submitted on 2020-04-07 07:05:51
Question: I have a binary classification problem where I want to calculate the roc_auc of the results. For this purpose I did it in two different ways using sklearn. My code is as follows.
Code 1:
from sklearn.metrics import make_scorer
from sklearn.metrics import roc_auc_score
myscore = make_scorer(roc_auc_score, needs_proba=True)
from sklearn.model_selection import cross_validate
my_value = cross_validate(clf, X, y, cv=10, scoring=myscore)
print(np.mean(my_value['test_score'].tolist()))
I get the
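
The post's Code 2 is cut off, so as a self-contained sketch the snippet below pairs the custom scorer from Code 1 with the built-in 'roc_auc' scoring string for comparison; with a probabilistic classifier the two should produce the same per-fold AUCs. The logistic-regression classifier and the synthetic data are stand-ins for the post's clf, X and y.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, roc_auc_score
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression(max_iter=1000)

# On recent scikit-learn (>= 1.4), needs_proba is replaced by response_method="predict_proba".
myscore = make_scorer(roc_auc_score, needs_proba=True)    # AUC from predicted probabilities
v1 = cross_validate(clf, X, y, cv=10, scoring=myscore)
v2 = cross_validate(clf, X, y, cv=10, scoring="roc_auc")  # built-in AUC scorer
print(np.mean(v1["test_score"]), np.mean(v2["test_score"]))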