What is a good way to randomly split a NumPy array into training and testing/validation datasets? Something similar to the cvpartition or crossvalind functions in MATLAB.
Just a note. In case you want train, test, AND validation sets, you can do this:
from sklearn.model_selection import train_test_split
X = get_my_X()
y = get_my_y()
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3)  # 70% train, 30% held out
x_test, x_val, y_test, y_val = train_test_split(x_test, y_test, test_size=0.5)  # split the held-out 30% in half
These parameters give 70% of the data to the training set and 15% each to the test and validation sets. Hope this helps.
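If you would rather not depend on scikit-learn, here is a minimal sketch using plain NumPy. It assumes X and y are arrays with the same number of rows; the helper name train_val_test_split and the 70/15/15 fractions are just illustrative. It shuffles an index array and slices it into three partitions:

import numpy as np

def train_val_test_split(X, y, train_frac=0.7, val_frac=0.15, seed=None):
    """Randomly partition X and y into train/val/test subsets (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))            # shuffled row indices
    n_train = int(train_frac * len(X))
    n_val = int(val_frac * len(X))
    train_idx = idx[:n_train]
    val_idx = idx[n_train:n_train + n_val]
    test_idx = idx[n_train + n_val:]         # remaining rows go to the test set
    return (X[train_idx], y[train_idx],
            X[val_idx], y[val_idx],
            X[test_idx], y[test_idx])

Passing a fixed seed (or random_state with train_test_split) makes the split reproducible across runs.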