How to split/partition a dataset into training and test datasets for, e.g., cross validation?

醉话见心 2020-11-27 10:42

What is a good way to randomly split a NumPy array into training and testing/validation datasets? Something similar to the cvpartition or crossvalind functions in MATLAB.

12 answers

甜味超标 (original poster)

2020-11-27 11:28

Here is code that splits the data into n=5 folds in a stratified manner, so each fold keeps roughly the same class proportions as the full dataset:

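# X = data array (NumPy array, one row per sample)
# y = class labels (NumPy array, one label per sample)
# Note: sklearn.cross_validation was removed in newer scikit-learn releases;
# StratifiedKFold now lives in sklearn.model_selection.
from sklearn.model_selection import StratifiedKFold

skf = StratifiedKFold(n_splits=5)
# split() yields one (train indices, test indices) pair per fold
for train_index, test_index in skf.split(X, y):
    print("TRAIN:", train_index, "TEST:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]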
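
The question also asks about a plain random split. That case can be sketched with NumPy alone by shuffling the row indices and slicing them. The toy arrays, the fixed seed, and the 80/20 ratio below are assumptions for illustration, not part of the original answer.

```python
import numpy as np

# Toy data, assumed for illustration: 10 samples, 2 features, binary labels
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

rng = np.random.default_rng(0)       # fixed seed so the split is reproducible
indices = rng.permutation(len(X))    # shuffled row indices
n_train = int(0.8 * len(X))          # assumed 80/20 train/test ratio

# The first 80% of the shuffled indices go to training, the rest to testing
train_idx, test_idx = indices[:n_train], indices[n_train:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
```

Indexing X and y with the same index arrays keeps each sample paired with its label. Unlike the stratified folds, this split does not guarantee that class proportions are preserved in the two parts.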
