scikit-learn random state in splitting dataset

无人共我 2020-12-05 09:12

Can anyone tell me why we set random_state to zero when splitting the data into train and test sets?

X_train, X_test, y_train, y_test = \\
    train_test_split(X, y, test_size         


        
9 answers
  •  执念已碎
    2020-12-05 09:47

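    random_state is an integer seed for the pseudo-random number generator that shuffles the data before it is split, so each value selects one particular combination of train and test samples. When you set test_size to 1/4, there are many possible train/test combinations, and a given random_state always reproduces the same one. Suppose you have the dataset [1, 2, 3, 4] (the state numbers here are illustrative, not what scikit-learn actually produces for those seeds):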

    Train    |  Test  |  State
    [1,2,3]  |  [4]   |  0
    [1,3,4]  |  [2]   |  1
    [4,2,3]  |  [1]   |  2
    [2,4,1]  |  [3]   |  3
    

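    We need it because, while tuning a model's parameters, the same split is then reused on every run. That way a change in accuracy comes from the parameter change and not from a different train/test split, so the split does not interfere with the comparison.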
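
The reproducibility described above can be checked with a small sketch. The toy arrays `X` and `y` and the helper `split_test_rows` are made up for illustration and are not from the question; only `train_test_split` and its `test_size` and `random_state` parameters come from scikit-learn.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data, invented for this example: 8 samples with one feature each.
X = np.arange(8).reshape(-1, 1)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])


def split_test_rows(seed):
    """Hypothetical helper: return the test-set feature values for a seed."""
    _, X_test, _, _ = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    return X_test.ravel().tolist()


# Calling the split twice with the same seed yields the same test set.
same_seed_repeats = split_test_rows(0) == split_test_rows(0)

# Different seeds usually (not necessarily always) select different test sets.
test_sets_by_seed = {seed: split_test_rows(seed) for seed in range(5)}

print("repeatable with seed 0:", same_seed_repeats)
for seed, rows in test_sets_by_seed.items():
    print("random_state =", seed, "-> test rows", rows)
```

The value 0 itself has no special meaning; any fixed integer gives a repeatable split, while leaving `random_state=None` gives a different split on each run.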

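    A random forest also takes a random_state, and the story is similar but applies to different things: there the seed controls the bootstrap sampling of the rows and the random choice of variables (features) considered at each split.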
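
As a rough sketch of the random forest case, the snippet below fits the same forest twice with one seed and compares the results. The synthetic dataset, the helper `fit_importances`, and the seed 42 are assumptions chosen for illustration, not part of the original answer.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, generated only for this example.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)


def fit_importances(seed):
    """Hypothetical helper: fit a small forest and return its feature importances."""
    forest = RandomForestClassifier(n_estimators=20, random_state=seed)
    forest.fit(X, y)
    return forest.feature_importances_


# The same seed reproduces the same bootstrap samples and feature choices,
# so the fitted forests (and their importances) match.
forest_repeatable = np.allclose(fit_importances(42), fit_importances(42))

print("same forest with the same seed:", forest_repeatable)
```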
