How to split/partition a dataset into training and test datasets for, e.g., cross validation?

醉话见心 2020-11-27 10:42

What is a good way to randomly split a NumPy array into training and testing/validation datasets? Something similar to the cvpartition or crossvalind functions in MATLAB.

12 answers
  •  执念已碎
    2020-11-27 11:27

    Another option is to use scikit-learn. As the scikit-learn documentation describes, you can use train_test_split as follows:

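    import numpy as np
    from sklearn.model_selection import train_test_split

    # Toy data: 5 samples with 2 features each, and one label per sample
    data, labels = np.arange(10).reshape((5, 2)), range(5)

    # Hold out 20% of the samples for testing; random_state makes the shuffle reproducible
    data_train, data_test, labels_train, labels_test = train_test_split(data, labels, test_size=0.20, random_state=42)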
    

    This way the labels stay in sync with the data you are splitting into training and test sets.
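
If scikit-learn is not available, the same idea can be sketched with NumPy alone: shuffle the row indices once and use those indices to slice both arrays, so each label stays with its row. This is only a minimal sketch. `split_train_test` is a hypothetical helper (not part of NumPy or scikit-learn), and rounding the test set size up is an assumption made here for illustration.

```python
import numpy as np

def split_train_test(data, labels, test_size=0.2, seed=42):
    """Hypothetical helper: shuffle row indices once, then slice both arrays with them."""
    data = np.asarray(data)
    labels = np.asarray(labels)
    if len(data) != len(labels):
        raise ValueError("data and labels must have the same number of rows")

    rng = np.random.default_rng(seed)          # seeded generator for reproducibility
    indices = rng.permutation(len(data))       # random ordering of the row indices
    n_test = int(np.ceil(test_size * len(data)))  # assumption: round the test size up

    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return data[train_idx], data[test_idx], labels[train_idx], labels[test_idx]

# Same toy data as in the answer above: row i is [2*i, 2*i + 1] and has label i
data, labels = np.arange(10).reshape((5, 2)), np.arange(5)
data_train, data_test, labels_train, labels_test = split_train_test(data, labels)
```

Because the same index arrays are applied to `data` and `labels`, row i of `data_train` always corresponds to entry i of `labels_train`.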
