Scoring metrics from Keras scikit-learn wrapper in cross validation with one-hot encoded labels

刺人心 2021-01-26 07:45

I am implementing a neural network and I would like to assess its performance with cross validation. Here is my current code:

def recall_m(y_true, y_pred):
    t         


        
3 Answers
  •  予麋鹿 (OP)
    2021-01-26 08:31

    I've been experimenting with @desertnaut's answer. However, because I have a multi-class problem, I ran into trouble, not with the loop itself but with the np.argmax() line. After googling I did not find an easy way to resolve it, so (on this user's advice) I ended up implementing CV by hand. It was a bit more complicated because I am using a pandas DataFrame (and the code can definitely be cleaned up further), but here is the working code:

    import numpy as np
    import pandas as pd

    # df and build_model() are defined earlier in the script
    feature_cols = ['start-sin', 'start-cos', 'start-sin-lag', 'start-cos-lag', 'prev-close-sin', 'prev-close-cos', 'prev-length', 'state-lag', 'monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday']
    label_cols = ['wait-categ-none', 'wait-categ-short', 'wait-categ-medium', 'wait-categ-long']

    ep = 120
    n_folds = 10
    df_split = np.array_split(df, n_folds)
    acc = []
    f1 = []
    prec = []
    recalls = []
    for test_part in range(n_folds):
        # build a fresh model for every fold so no weights carry over
        model = build_model()
        print("CV Fold, with test partition i =", test_part)

        # every partition except the current one goes into the training set
        train_df = pd.concat([part for i, part in enumerate(df_split) if i != test_part], axis=0)
        test_df = df_split[test_part]
        train_x = train_df[feature_cols]
        test_x = test_df[feature_cols]
        # enforce numeric 0/1 labels instead of booleans
        train_y = train_df[label_cols].replace({False: 0, True: 1})
        test_y = test_df[label_cols].replace({False: 0, True: 1})

        # fit
        model.fit(train_x, train_y, epochs=ep, verbose=1)
        # score: the unpacking order must match the metrics list given to model.compile()
        loss, accuracy, f1_score, precision, recall = model.evaluate(test_x, test_y, verbose=0)
        # save
        acc.append(accuracy)
        f1.append(f1_score)
        prec.append(precision)
        recalls.append(recall)
    print("CV finished.\n")

    print("Mean Accuracy")
    print(sum(acc)/len(acc))
    print("Mean F1 score")
    print(sum(f1)/len(f1))
    print("Mean Precision")
    print(sum(prec)/len(prec))
    print("Mean Recall rate")
    print(sum(recalls)/len(recalls))
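
For anyone who would rather score each fold outside Keras: below is a minimal sketch of turning one-hot labels and predicted probabilities into class indices with np.argmax and computing the scores by hand. It is not the code from the linked answer. The helper name macro_scores, the toy arrays and the choice of macro averaging are my own assumptions for illustration. Note the explicit axis=1, which gives one class index per row; without it np.argmax flattens the array and returns a single number.

```python
import numpy as np

def macro_scores(y_true_onehot, y_prob):
    """Accuracy plus macro-averaged precision, recall and F1 (hypothetical helper)."""
    y_true_onehot = np.asarray(y_true_onehot)
    y_true = np.argmax(y_true_onehot, axis=1)        # one-hot row -> true class index
    y_pred = np.argmax(np.asarray(y_prob), axis=1)   # probability row -> predicted class index
    n_classes = y_true_onehot.shape[1]

    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        # a class with no predictions (or no true samples) is scored 0 here; that is a choice
        p = tp / (tp + fp) if tp + fp > 0 else 0.0
        r = tp / (tp + fn) if tp + fn > 0 else 0.0
        f = 2 * p * r / (p + r) if p + r > 0 else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": float(np.mean(y_pred == y_true)),
        "precision": float(np.mean(precisions)),
        "recall": float(np.mean(recalls)),
        "f1": float(np.mean(f1s)),
    }

# toy data: 4 samples, 3 classes (made up for the example)
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob = np.array([[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]])
scores = macro_scores(y_true, y_prob)
print(scores)
```

Inside the CV loop this would presumably be called as macro_scores(test_y.to_numpy(), model.predict(test_x)), assuming the model's last layer is a softmax over the four wait categories.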
    
