Specific number of test/train size for each class in sklearn
Question

Data:

    import pandas as pd

    data = pd.DataFrame({'classes': [1, 1, 1, 2, 2, 2, 2],
                         'b': [3, 4, 5, 6, 7, 8, 9],
                         'c': [10, 11, 12, 13, 14, 15, 16]})

My code:

    import numpy as np
    # sklearn.cross_validation was removed in scikit-learn 0.20;
    # train_test_split now lives in sklearn.model_selection.
    from sklearn.model_selection import train_test_split

    X = np.array(data[['b', 'c']])
    y = np.array(data['classes'])
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=4)

Question: train_test_split draws the test set at random from all the classes. Is there any way to get the same number of test samples for each class? (For example
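One possible way to get a fixed number of test samples per class is to sample indices class by class with NumPy, sketched below. The helper name `split_fixed_per_class` and its parameters `n_test_per_class` and `seed` are hypothetical and not part of scikit-learn. Note that the `stratify=y` option of `train_test_split` keeps class proportions, which is not the same as equal counts per class.

```python
import numpy as np


def split_fixed_per_class(X, y, n_test_per_class, seed=None):
    """Hypothetical helper: hold out exactly n_test_per_class samples of each class."""
    rng = np.random.RandomState(seed)
    test_idx = []
    for label in np.unique(y):
        # Positions of all samples that belong to this class.
        label_idx = np.flatnonzero(y == label)
        if n_test_per_class > len(label_idx):
            raise ValueError("class %r has only %d samples" % (label, len(label_idx)))
        # Draw the test samples for this class without replacement.
        test_idx.extend(rng.choice(label_idx, size=n_test_per_class, replace=False))

    test_idx = np.sort(np.array(test_idx, dtype=int))
    # Everything that was not drawn for the test set goes to the training set.
    train_mask = np.ones(len(y), dtype=bool)
    train_mask[test_idx] = False
    return X[train_mask], X[test_idx], y[train_mask], y[test_idx]


# Same values as the DataFrame in the question (columns 'b' and 'c').
X = np.array([[3, 10], [4, 11], [5, 12], [6, 13], [7, 14], [8, 15], [9, 16]])
y = np.array([1, 1, 1, 2, 2, 2, 2])

X_train, X_test, y_train, y_test = split_fixed_per_class(
    X, y, n_test_per_class=2, seed=0
)
# y_test holds two samples of class 1 and two of class 2;
# the remaining three samples (one of class 1, two of class 2) form the training set.
```

Which rows end up in the test set depends on `seed`, but the per-class counts do not. Extending the sketch to a different count per class (for example a dict mapping label to count) would be a small change to the loop.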