How to split/partition a dataset into training and test datasets for, e.g., cross validation?

醉话见心 2020-11-27 10:42

What is a good way to randomly split a NumPy array into training and testing/validation datasets? Something similar to the cvpartition or crossvalind

12 Answers
  •  感情败类
    2020-11-27 11:30

    Just a note. In case you want train, test, AND validation sets, you can do this:

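    # sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
    from sklearn.model_selection import train_test_split

    X = get_my_X()  # placeholder: load your feature array
    y = get_my_y()  # placeholder: load your labels
    # First split: 70% for training, 30% held out
    x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    # Second split: halve the held-out 30% into test and validation (15% each)
    x_test, x_val, y_test, y_val = train_test_split(x_test, y_test, test_size=0.5)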
    

    With these parameters, 70% of the data goes to training and 15% each to the test and validation sets. Hope this helps.
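
    If you would rather not depend on scikit-learn, the same 70/15/15 split can be sketched with NumPy alone by shuffling the row indices once and slicing them. This is only a minimal sketch: the helper name `train_val_test_split`, its parameters, and the toy `X` and `y` arrays are made up for illustration and are not part of any library.

```python
import numpy as np


def train_val_test_split(X, y, val_frac=0.15, test_frac=0.15, seed=0):
    """Hypothetical helper: shuffle row indices once, then slice them three ways."""
    X = np.asarray(X)
    y = np.asarray(y)
    n = X.shape[0]
    if y.shape[0] != n:
        raise ValueError("X and y must have the same number of rows")

    # A seeded generator makes the shuffle reproducible.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)

    n_test = int(round(test_frac * n))
    n_val = int(round(val_frac * n))

    # Disjoint index slices: test first, then validation, the rest is training.
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]

    return (
        (X[train_idx], y[train_idx]),
        (X[val_idx], y[val_idx]),
        (X[test_idx], y[test_idx]),
    )


# Toy data (assumed for illustration): row i of X is [2*i, 2*i + 1], label is i.
X = np.arange(200).reshape(100, 2)
y = np.arange(100)

(train_X, train_y), (val_X, val_y), (test_X, test_y) = train_val_test_split(X, y)
print(len(train_X), len(val_X), len(test_X))  # 70 15 15
```

    Indexing X and y with the same index arrays keeps each row paired with its label, which is the main thing to get right when splitting by hand.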
