How to split/partition a dataset into training and test datasets for, e.g., cross validation?

醉话见心 2020-11-27 10:42

What is a good way to randomly split a NumPy array into training and testing/validation datasets? Something similar to the cvpartition or crossvalind functions in MATLAB.

12 Answers
  •  甜味超标
    2020-11-27 11:28

    Here is code to split the data into n=5 folds in a stratified manner, so that each fold keeps roughly the same class proportions as the full dataset:

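    # X = data array
    # y = class labels
    # sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
    from sklearn.model_selection import StratifiedKFold
    skf = StratifiedKFold(n_splits=5)
    # split() yields one (train indices, test indices) pair per fold
    for train_index, test_index in skf.split(X, y):
        print("TRAIN:", train_index, "TEST:", test_index)
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = y[train_index], y[test_index]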
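
If you only need a single random train/test split rather than stratified folds, plain NumPy is enough. The following is a minimal sketch, not part of the original answer: the helper name `random_split`, its `test_fraction` and `seed` parameters, and the toy data are all assumptions made for illustration. It shuffles the row indices once and slices them, so it does not preserve class proportions.

```python
import numpy as np

def random_split(X, y, test_fraction=0.2, seed=0):
    """Hypothetical helper: shuffle row indices once, then slice into train/test."""
    rng = np.random.default_rng(seed)          # seeded generator for reproducibility
    indices = rng.permutation(len(X))          # random ordering of 0..n_samples-1
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy data (assumed for illustration): 10 samples, 2 features, alternating labels
X = np.arange(20).reshape(10, 2)
y = np.arange(10) % 2

X_train, X_test, y_train, y_test = random_split(X, y, test_fraction=0.2, seed=42)
print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```

Indexing X and y with the same index arrays keeps each sample paired with its label. If scikit-learn is available, `train_test_split` from `sklearn.model_selection` covers the same use case and also accepts a `stratify` argument.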
    
