How to calculate ROC_AUC score having 3 classes

Submitted by 帅比萌擦擦* on 2021-02-11 13:30:04

Question


I have a dataset with 3 class labels (0, 1, 2). I tried to plot an ROC curve and did so using the pos_label parameter.

fpr, tpr, thresholds = metrics.roc_curve(Ytest, y_pred_prob, pos_label = 0)

By changing pos_label to 0, 1, and 2, I get 3 graphs. Now I am having trouble calculating the AUC score. How can I average the 3 graphs, plot a single curve from them, and then calculate the ROC AUC score? I get an error from metrics.roc_auc_score(Ytest, y_pred_prob):

ValueError: multiclass format is not supported

Please help me.

# store the predicted probabilities for class 0
y_pred_prob = cls.predict_proba(Xtest)[:, 0]
#first argument is true values, second argument is predicted probabilities
fpr, tpr, thresholds = metrics.roc_curve(Ytest, y_pred_prob, pos_label = 0)
plt.plot(fpr, tpr)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.title('ROC curve classifier')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')
plt.grid(True)

# store the predicted probabilities for class 1
y_pred_prob = cls.predict_proba(Xtest)[:, 1]
#first argument is true values, second argument is predicted probabilities
fpr, tpr, thresholds = metrics.roc_curve(Ytest, y_pred_prob, pos_label = 1)
plt.plot(fpr, tpr)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.title('ROC curve classifier')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')

plt.grid(True)

# store the predicted probabilities for class 2
y_pred_prob = cls.predict_proba(Xtest)[:, 2]
#first argument is true values, second argument is predicted probabilities
fpr, tpr, thresholds = metrics.roc_curve(Ytest, y_pred_prob, pos_label = 2)
plt.plot(fpr, tpr)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.title('ROC curve classifier')
plt.xlabel('False Positive Rate (1 - Specificity)')
plt.ylabel('True Positive Rate (Sensitivity)')

plt.grid(True)

The above code generates 3 ROC curves, one per class.

I want a single ROC curve obtained from the 3 above by taking the average or mean, and then a single roc_auc score from it.
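The ValueError above arises because roc_auc_score was given a single probability column against 3-class labels. A minimal sketch of the direct fix, using synthetic data in place of the asker's Xtest/Ytest (the names cls, Xtest, Ytest follow the question; the data here is illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

# illustrative 3-class data standing in for the asker's dataset
X, y = make_classification(n_samples=300, n_classes=3, n_informative=4,
                           random_state=0)
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, y, random_state=0)

cls = LogisticRegression(max_iter=1000).fit(Xtrain, Ytrain)

# pass ALL three probability columns, not a single column
y_pred_prob = cls.predict_proba(Xtest)          # shape (n_samples, 3)
auc = metrics.roc_auc_score(Ytest, y_pred_prob,
                            multi_class='ovr',  # one-vs-rest AUC per class
                            average='macro')    # unweighted mean of the 3 AUCs
print(auc)
```

The multi_class='ovr' option (available since scikit-learn 0.22) computes one one-vs-rest AUC per class and averages them, which is exactly the "one number from three curves" the question asks for.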


Answer 1:


For multiclass, it is often useful to calculate the AUROC for each class. For example, here's an excerpt from some code I use to calculate AUROC for each class separately, where label_meanings is a list of strings describing what each label is, and the various arrays are formatted such that each row is a different example and each column corresponds to a different label:

import sklearn.metrics

for label_number in range(len(label_meanings)):
    which_label = label_meanings[label_number]  # descriptive string for the label
    true_labels = true_labels_array[:, label_number]
    pred_probs = pred_probs_array[:, label_number]
    # AUROC (sliding across multiple decision thresholds)
    fpr, tpr, thresholds = sklearn.metrics.roc_curve(y_true=true_labels,
                                                     y_score=pred_probs,
                                                     pos_label=1)
    auc = sklearn.metrics.auc(fpr, tpr)

If you want to plot an average ROC curve across your three classes: this code https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html includes parts that calculate the average AUC so that you can make a plot (with three classes, it will plot the average ROC curve for the three classes).
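The macro-averaging step in that linked example can be sketched as follows. This is not the asker's data; y_test and y_score are random stand-ins for the true labels and the (n_samples, 3) probability matrix, so only the mechanics (interpolating each curve onto a common FPR grid and averaging the TPRs) are meaningful:

```python
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn import metrics

# toy stand-ins for the real labels and predicted probabilities
rng = np.random.RandomState(0)
y_test = rng.randint(0, 3, size=200)
y_score = rng.dirichlet(np.ones(3), size=200)   # each row sums to 1
y_test_bin = label_binarize(y_test, classes=[0, 1, 2])

# per-class ROC curves (one-vs-rest)
fpr, tpr = {}, {}
for i in range(3):
    fpr[i], tpr[i], _ = metrics.roc_curve(y_test_bin[:, i], y_score[:, i])

# interpolate every curve onto a common FPR grid, then average the TPRs
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(3)]))
mean_tpr = np.zeros_like(all_fpr)
for i in range(3):
    mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])
mean_tpr /= 3

macro_auc = metrics.auc(all_fpr, mean_tpr)
```

Plotting all_fpr against mean_tpr gives the single averaged curve the question asks for, and macro_auc is its area.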

If you just want an average AUC across your three classes: once you have calculated the AUC of each class separately you can average the three numbers to get an overall AUC.
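A quick sketch of that averaging, again on random stand-in data rather than the asker's; it also checks that the hand-computed mean matches what roc_auc_score returns with average='macro':

```python
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn import metrics

# random stand-ins for true labels and a (n_samples, 3) probability matrix
rng = np.random.RandomState(0)
y_test = rng.randint(0, 3, size=200)
y_score = rng.dirichlet(np.ones(3), size=200)
y_bin = label_binarize(y_test, classes=[0, 1, 2])

# AUC of each class separately, then the plain mean of the three numbers
per_class = [metrics.roc_auc_score(y_bin[:, i], y_score[:, i]) for i in range(3)]
mean_auc = np.mean(per_class)

# the same quantity computed by scikit-learn in one call
macro = metrics.roc_auc_score(y_test, y_score, multi_class='ovr', average='macro')
```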

If you want more background on AUROC and how it is calculated for single class versus multi class you can see this article, Measuring Performance: AUC (AUROC).




Answer 2:


Highlights of multi-class AUC:

You cannot calculate a single common AUC for all classes; you must calculate the AUC for each class separately, just as you calculate recall and precision separately for each class in multi-class classification.

THE SIMPLEST method of calculating the AUC for individual classes:

  1. Choose a classifier:

from sklearn.linear_model import LogisticRegression

LRE = LogisticRegression(solver='lbfgs')

LRE.fit(X_train, y_train)
  2. Make a list of the classes:

    d = y_test.unique()

    class_name = list(d.flatten())

    class_name

  3. Now calculate the AUC for each class separately:

    for p in class_name:
        # use the probability column that corresponds to class p
        col = list(LRE.classes_).index(p)
        fpr, tpr, thresholds = metrics.roc_curve(y_test,
                        LRE.predict_proba(X_test)[:, col],
                        pos_label=p)
        auroc = round(metrics.auc(fpr, tpr), 2)
        print('LRE', p, '--AUC--->', auroc)



Source: https://stackoverflow.com/questions/56227246/how-to-calculate-roc-auc-score-having-3-classes
