How to split/partition a dataset into training and test datasets for, e.g., cross validation?

醉话见心 2020-11-27 10:42

What is a good way to randomly split a NumPy array into training and testing/validation datasets? Something similar to MATLAB's cvpartition or crossvalind functions.

12 answers

执念已碎 (OP)

2020-11-27 11:27
Another option is to use scikit-learn. As the scikit-learn wiki describes, you can use the following:
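```
import numpy as np
from sklearn.model_selection import train_test_split

# Toy dataset: 5 samples with 2 features each, and one label per sample
data, labels = np.arange(10).reshape((5, 2)), range(5)

# Hold out 20% of the samples for testing; random_state makes the shuffle reproducible
data_train, data_test, labels_train, labels_test = train_test_split(data, labels, test_size=0.20, random_state=42)
```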
This way the labels stay in sync with the data you're splitting into training and test sets.
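If you would rather not depend on scikit-learn, the same idea can be sketched in plain NumPy: shuffle the row indices once and use that single permutation to index both arrays, which is what keeps data and labels aligned. The helper name `split_train_test` and its parameters `test_fraction` and `seed` are made up for this illustration; this is a minimal sketch, not how `train_test_split` is implemented internally.

```python
import numpy as np


def split_train_test(data, labels, test_fraction=0.2, seed=42):
    """Hypothetical helper: split data and labels with one shared permutation."""
    data = np.asarray(data)
    labels = np.asarray(labels)

    # One random ordering of the row indices, reused for both arrays
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(data))

    # The first n_test shuffled indices form the test set, the rest the training set
    n_test = int(round(test_fraction * len(data)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    return data[train_idx], data[test_idx], labels[train_idx], labels[test_idx]


# Same toy dataset as above: row i is [2*i, 2*i + 1] and has label i
data, labels = np.arange(10).reshape((5, 2)), np.arange(5)
data_train, data_test, labels_train, labels_test = split_train_test(data, labels)

print(data_train.shape, data_test.shape)  # (4, 2) (1, 2)
```

Because both arrays are indexed with the same `train_idx` and `test_idx`, each row keeps its own label no matter how the shuffle comes out.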