Calculate precision and recall from a confusion matrix

孤街浪徒 2021-01-05 13:48

Suppose I have a confusion matrix like the one below. How can I calculate precision and recall?

4 Answers
  • 2021-01-05 14:09

    Given:

    hypothetical confusion matrix (cm)

    cm = 
    [[ 970    1    2    1    1    6   10    0    5    0]
     [   0 1105    7    3    1    6    0    3   16    0]
     [   9   14  924   19   18    3   13   12   24    4]
     [   3   10   35  875    2   34    2   14   19   19]
     [   0    3    6    0  903    0    9    5    4   32]
     [   9    6    4   28   10  751   17    5   24    9]
     [   7    2    6    0    9   13  944    1    7    0]
     [   3   11   17    3   16    3    0  975    2   34]
     [   5   38   10   16    7   28    5    4  830   20]
     [   5    3    5   13   39   10    2   34    5  853]]
    

    Goal:

Compute precision and recall for each class, using map() to perform element-wise division of the lists.

    from operator import truediv
    import numpy as np
    
    tp = np.diag(cm)                                    # true positives: the diagonal of cm
    prec = list(map(truediv, tp, np.sum(cm, axis=0)))   # divide by column sums (predicted totals per class)
    rec = list(map(truediv, tp, np.sum(cm, axis=1)))    # divide by row sums (actual totals per class)
    print ('Precision: {}\nRecall: {}'.format(prec, rec))
    

    Result:

    Precision: [0.959, 0.926, 0.909, 0.913, 0.896, 0.880, 0.941, 0.925, 0.886, 0.877]
    Recall:    [0.972, 0.968, 0.888, 0.863, 0.937, 0.870, 0.954, 0.916, 0.861, 0.880]
    

    Please note: there are 10 classes, so there are 10 precision values and 10 recall values.
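
    If it helps, the same per-class values can be obtained with plain NumPy array division instead of map(); this is only a sketch, assuming cm is the matrix shown above:

    import numpy as np
    
    cm = np.array(cm)              # convert the nested list above, if it is not already an array
    tp = np.diag(cm)               # true positives per class
    prec = tp / cm.sum(axis=0)     # column sums = total predicted per class
    rec = tp / cm.sum(axis=1)      # row sums = total actual per class
    print('Precision: {}\nRecall: {}'.format(prec.round(3), rec.round(3)))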

  • 2021-01-05 14:13

    First, your matrix is arranged upside down. You want to arrange your labels so that the true positives sit on the diagonal [(0,0), (1,1), (2,2)]; this is the arrangement you will find in confusion matrices generated by sklearn and other packages.

    Once things are arranged in the right direction, we can take a page from this answer and say that:

    1. True positives are on the diagonal.
    2. False positives are the column-wise sums, excluding the diagonal.
    3. False negatives are the row-wise sums, excluding the diagonal.

    Then we take the formulas for precision and recall from the sklearn docs and put it all into code:

    import numpy as np
    cm = np.array([[2,1,0], [3,4,5], [6,7,8]])
    true_pos = np.diag(cm)                      # diagonal entries
    false_pos = np.sum(cm, axis=0) - true_pos   # column sums minus the diagonal
    false_neg = np.sum(cm, axis=1) - true_pos   # row sums minus the diagonal
    
    precision = np.sum(true_pos / (true_pos + false_pos))
    recall = np.sum(true_pos / (true_pos + false_neg))
    

    Since we subtract the true positives to define false_pos/false_neg only to add them back, we can simplify further by skipping a couple of steps:

    true_pos = np.diag(cm)
    precision = np.sum(true_pos / np.sum(cm, axis=0))
    recall = np.sum(true_pos / np.sum(cm, axis=1))
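
    For reference, a minimal sketch (reusing the same 3x3 cm defined above) that keeps the per-class values before np.sum() collapses them into a single number:

    import numpy as np
    
    cm = np.array([[2, 1, 0], [3, 4, 5], [6, 7, 8]])
    true_pos = np.diag(cm)
    
    # per-class values, before any summation
    precision_per_class = true_pos / np.sum(cm, axis=0)   # approx. [0.182, 0.333, 0.615]
    recall_per_class = true_pos / np.sum(cm, axis=1)      # approx. [0.667, 0.333, 0.381]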
    
  • 2021-01-05 14:25

    For the sake of completeness and future reference: given a list of ground truth labels (gt) and predictions (pd), the following code snippet computes the confusion matrix and then calculates precision and recall.

    from sklearn.metrics import confusion_matrix
    
    gt = [1,1,2,2,1,0]
    pd = [1,1,1,1,2,0]
    
    cm = confusion_matrix(gt, pd)
    
    # rows = ground truth, columns = predictions
    
    # compute tp, tp_and_fn and tp_and_fp w.r.t. all classes
    tp_and_fn = cm.sum(1)
    tp_and_fp = cm.sum(0)
    tp = cm.diagonal()
    
    precision = tp / tp_and_fp
    recall = tp / tp_and_fn
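    
    Printing the results of the snippet above (the values below follow from the example gt and pd lists, with sklearn's default label ordering 0, 1, 2):
    
    print(cm)           # values: [[1 0 0], [0 2 1], [0 2 0]]
    print(precision)    # per-class precision: [1.0, 0.5, 0.0]
    print(recall)       # per-class recall: approx. [1.0, 0.667, 0.0]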
    
  • 2021-01-05 14:27

    I don't think you need the summation at the end. Without it, your method is correct; it gives precision and recall for each class.

    If you intend to calculate average precision and recall, you have two common options: micro-averaging and macro-averaging.

    Read more here: http://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html
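
    For illustration, a minimal sketch of the two averaging options computed directly from a confusion matrix (not tied to the original data; rows = ground truth and columns = predictions are assumed):

    import numpy as np
    
    cm = np.array([[2, 1, 0], [3, 4, 5], [6, 7, 8]])   # any square confusion matrix
    tp = np.diag(cm)
    prec_per_class = tp / cm.sum(axis=0)
    rec_per_class = tp / cm.sum(axis=1)
    
    # macro-average: unweighted mean of the per-class values
    macro_precision = prec_per_class.mean()
    macro_recall = rec_per_class.mean()
    
    # micro-average: pool counts over all classes first; for a single-label
    # multiclass confusion matrix this equals overall accuracy
    micro_precision = tp.sum() / cm.sum()
    micro_recall = tp.sum() / cm.sum()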
