How can I analyze a confusion matrix?

北恋 2021-01-19 05:04

When I print scikit-learn's confusion matrix, I get a very large matrix. I want to work out which entries are the true positives, true negatives, and so on. How can I do that? This is

3 answers

情书的邮戳 (original poster)

2021-01-19 05:55
Approach 1: Binary Classification
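```
from sklearn.metrics import confusion_matrix
import pandas as pd

y_test = [1, 0, 0]  # true labels
y_pred = [1, 0, 0]  # predicted labels

# Rows are actual classes, columns are predicted classes.
# labels=[0, 1] fixes the class order so the names below line up.
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])

row_names = ["Actual 0", "Actual 1"]
col_names = ["Predicted 0", "Predicted 1"]
print(pd.DataFrame(matrix, index=row_names, columns=col_names))
```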
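For the binary case, the four cells can also be pulled out as named counts. The snippet below is a minimal sketch that reuses the same toy labels; it relies on scikit-learn's layout for a two-class matrix (rows are actual classes, columns are predicted classes), so flattening it yields TN, FP, FN, TP in that order. Passing labels=[0, 1] is an addition here, made on the assumption that 1 is the positive class.

```python
from sklearn.metrics import confusion_matrix

y_test = [1, 0, 0]  # true labels
y_pred = [1, 0, 0]  # predicted labels

# labels=[0, 1] pins the row/column order, so the matrix is always 2x2
# even if one class happens to be missing from the data.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
```

With these labels every prediction is correct, so both FP and FN come out as zero.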
Approach 2: Multiclass Classification

While sklearn.metrics.confusion_matrix provides a numeric matrix, you can generate a 'report' using the following:
```
import pandas as pd
y_true = pd.Series([2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2])
y_pred = pd.Series([0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2])

pd.crosstab(y_true, y_pred, rownames=['True'], colnames=['Predicted'], margins=True)
```
which results in:
```
Predicted  0  1  2  All
True                   
0          3  0  0    3
1          0  1  2    3
2          2  1  3    6
All        5  2  5   12
```
This allows us to see that:
1. The diagonal elements show the number of correct classifications for each class: 3, 1 and 3 for the classes 0, 1 and 2.
2. The off-diagonal elements show the misclassifications: for example, 2 samples of class 2 were misclassified as 0, no samples of class 0 were misclassified as 2, etc.
3. The "All" subtotals give the total count for each class in both y_true (row totals) and y_pred (column totals).
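The same reading of the table can be turned into true/false positives and negatives per class by treating each class in turn as "positive" and everything else as "negative". The following is a rough sketch in plain Python, not part of scikit-learn or pandas; the helper name per_class_counts is made up for illustration, and the matrix is copied from the table above without the "All" margins.

```python
def per_class_counts(matrix, labels):
    """Return {label: {"tp", "fp", "fn", "tn"}} for a square confusion
    matrix whose rows are true classes and columns are predicted classes.
    (Hypothetical helper, written for this example only.)"""
    total = sum(sum(row) for row in matrix)
    counts = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                        # diagonal cell
        fn = sum(matrix[i]) - tp                 # rest of the row
        fp = sum(row[i] for row in matrix) - tp  # rest of the column
        tn = total - tp - fn - fp                # everything else
        counts[label] = {"tp": tp, "fp": fp, "fn": fn, "tn": tn}
    return counts


# Counts taken from the crosstab output above.
matrix = [
    [3, 0, 0],
    [0, 1, 2],
    [2, 1, 3],
]

counts = per_class_counts(matrix, labels=[0, 1, 2])
for label, c in counts.items():
    print(label, c)
```

For class 1, for instance, this gives 1 true positive, 2 false negatives (the class-1 samples predicted as 2) and 1 false positive (the class-2 sample predicted as 1).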
The pd.crosstab method also works with text labels and, for datasets with many samples, can be extended to report percentages instead of raw counts.
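One possible way to get a percentage report is the normalize argument of pd.crosstab. The sketch below reuses the y_true and y_pred series from above and assumes row-wise normalization is what is wanted, so each row shows how the samples of one true class were distributed across the predicted classes.

```python
import pandas as pd

y_true = pd.Series([2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2])
y_pred = pd.Series([0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2])

# normalize="index" divides each row by its row total;
# multiplying by 100 turns the fractions into percentages.
report = pd.crosstab(
    y_true, y_pred,
    rownames=["True"], colnames=["Predicted"],
    normalize="index",
) * 100

print(report.round(1))
```

Using normalize="columns" instead would express each column as a share of the predictions made for that class.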