How to split data into trainset and testset randomly?

花落未央 2020-12-07 16:27

I have a large dataset and want to split it randomly into a training set (50%) and a test set (50%).

Say I have 100 examples stored in the input file, one example per line.

9 Answers
  •  攒了一身酷
    2020-12-07 17:21

    sklearn.cross_validation has been deprecated since version 0.18; use sklearn.model_selection instead, as shown below:

    from sklearn.model_selection import train_test_split
    import numpy

    # Open in text mode so the lines are str, not bytes
    with open("datafile.txt", "r") as f:
        data = f.read().splitlines()  # one example per line

    data = numpy.array(data)  # convert the list to a NumPy array

    # test_size=0.5 puts half of the data in the test set
    x_train, x_test = train_test_split(data, test_size=0.5)
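
If scikit-learn is not available, the same random 50/50 split can be sketched with the standard library alone. This is only an illustration under stated assumptions: the helper name `split_examples`, its `seed` parameter, and the generated stand-in data are hypothetical and not part of the original question or answer.

```python
import random


def split_examples(examples, test_fraction=0.5, seed=None):
    """Return (train, test) lists drawn at random from `examples`.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)   # private generator; a fixed seed makes the split repeatable
    shuffled = list(examples)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for the 100 lines of the input file (assumed data).
examples = [f"example {i}" for i in range(100)]
train, test = split_examples(examples, test_fraction=0.5, seed=0)
print(len(train), len(test))  # prints "50 50"
```

To use it on a real file, read the lines first (for example with `f.read().splitlines()`) and pass the resulting list in place of `examples`.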
    
