What is a good way to split a NumPy array randomly into training and testing/validation dataset? Something similar to the cvpartition or crossvalind
Here is a code to split the data into n=5 folds in a stratified manner
% X = data array
% y = Class_label
from sklearn.cross_validation import StratifiedKFold
skf = StratifiedKFold(y, n_folds=5)
for train_index, test_index in skf:
print("TRAIN:", train_index, "TEST:", test_index)
X_train, X_test = X[train_index], X[test_index]
y_train, y_test = y[train_index], y[test_index]