Stratified Train/Test-split in scikit-learn


I need to split my data into a training set (75%) and test set (25%). I currently do that with the code below:

X, Xt, userInfo, userInfo_train = sklearn.cros         
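The call above is cut off, so what follows is not a reconstruction of it. As a minimal sketch of a stratified 75/25 split, assuming a recent scikit-learn where `train_test_split` lives in `sklearn.model_selection` and accepts a `stratify` argument, something like this should work. The toy arrays `X` and `y` are hypothetical stand-ins for the real data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 100 samples, 80 of class 0 and 20 of class 1.
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 80 + [1] * 20)

# stratify=y asks for the same class ratio in both parts of the 75/25 split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Per-class sample counts in each part, to check the ratio was preserved.
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
```

Comparing `train_counts` and `test_counts` is a quick way to confirm that both parts keep the 80/20 class balance of the full data.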


        
7 Answers
  •  栀梦 (original poster)
    2020-11-27 03:29

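    import pandas as pd
    from sklearn.model_selection import train_test_split

    # train_size is 1 - tst_size - vld_size
    tst_size = 0.15
    vld_size = 0.15

    # First split: hold out the validation set (assumes df has a label column named 'y').
    X_train_test, X_valid, y_train_test, y_valid = train_test_split(
        df.drop('y', axis=1), df.y, test_size=vld_size, random_state=13903)

    # Keep the feature sets as DataFrames.
    X_train_test = pd.DataFrame(X_train_test)
    X_valid = pd.DataFrame(X_valid)

    # Second split: rescale test_size so the test set is tst_size of the full data.
    X_train, X_test, y_train, y_test = train_test_split(
        X_train_test, y_train_test, test_size=tst_size / (1 - vld_size),
        random_state=13903)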
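The snippet above splits at random without stratifying on the labels. A stratified train/validation/test variant could be sketched as below, assuming `train_test_split` from `sklearn.model_selection` with its `stratify` argument. The toy frame `df`, its `feature` column, and the label column name `y` are hypothetical, and the second `test_size` is rescaled so the test set is roughly 15% of the full data:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame (hypothetical): one feature column and a label column "y"
# with an 80/20 class balance.
df = pd.DataFrame({"feature": range(200), "y": [0] * 160 + [1] * 40})

tst_size = 0.15
vld_size = 0.15

X = df.drop("y", axis=1)
y = df["y"]

# First split: hold out the validation set, stratified on the labels.
X_train_test, X_valid, y_train_test, y_valid = train_test_split(
    X, y, test_size=vld_size, stratify=y, random_state=13903
)

# Second split: rescale test_size so the test set is about tst_size of the
# full data, and stratify on the labels that remain after the first split.
X_train, X_test, y_train, y_test = train_test_split(
    X_train_test,
    y_train_test,
    test_size=tst_size / (1 - vld_size),
    stratify=y_train_test,
    random_state=13903,
)
```

Each of `y_train.mean()`, `y_valid.mean()` and `y_test.mean()` should then come out close to the overall positive rate of 0.2.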
    
