How to randomly split data into a training set and a test set?

花落未央 2020-12-07 16:27

I have a large dataset and want to split it into a training set (50%) and a test set (50%).

Say I have 100 examples stored in the input file, and each line contains one example.

9 answers
  •  自闭症患者
    2020-12-07 17:25

    You could also use NumPy, provided your data is stored in a numpy.ndarray:

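    import numpy as np
    from random import sample

    # data is assumed to be an already loaded numpy.ndarray, one example per row
    l = len(data)  # total number of examples (100 in the question)
    f = l // 2     # number of examples to draw for training (50%)
    indices = sample(range(l), f)  # f distinct row indices, chosen at random

    train_data = data[indices]
    # axis=0 removes whole rows; without it np.delete flattens a 2-D array first
    test_data = np.delete(data, indices, axis=0)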
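
If the examples are still plain lines from the input file rather than an ndarray, the same idea can probably be sketched with the standard library alone: shuffle a copy of the lines, then cut the list at the 50% mark. The helper name `split_lines`, its `train_fraction` and `seed` parameters, and the generated stand-in list are all hypothetical, chosen only for illustration; in practice the list would come from reading the file line by line.

```python
import random

def split_lines(lines, train_fraction=0.5, seed=None):
    """Shuffle a copy of `lines` and cut it into (train, test)."""
    rng = random.Random(seed)      # private generator, so a seed makes the split repeatable
    shuffled = list(lines)         # copy, so the caller's list keeps its order
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for the 100 lines of the input file (assumption: one example per line).
examples = [f"example {i}" for i in range(100)]

train, test = split_lines(examples, seed=0)
print(len(train), len(test))  # 50 50
```

Shuffling once and slicing ensures that every example lands in exactly one of the two sets, and passing a fixed `seed` makes the split reproducible between runs.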
    
