Using confusion matrix as scoring metric in cross validation in scikit-learn

野性不改 2021-01-31 11:15

I am creating a pipeline in scikit learn,

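from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('bow', CountVectorizer()),
    ('classifier', BernoulliNB()),
])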

a

5 answers

忘了有多久 (original poster)

2021-01-31 11:55
I am new to machine learning. If I understand correctly, a binary confusion matrix is made up of four values: TP, FN, FP and TN. These four values cannot be obtained directly from scoring, but they are implied by accuracy, precision and recall.

That leaves four unknowns (TP, FN, FP and TN) and three equations, where P is precision, R is recall and A is accuracy:

Eq1 : tp/(tp+fp)=P

Eq2 : tp/(tp+fn)=R

Eq3 : (tp+tn)/(tp+fn+fp+tn)=A
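All three equations depend only on the ratios between the four values, so one unknown can be set to 1 (here TP = 1). That leaves three unknowns and three equations, and the values relative to TP can be solved as a system of equations.
1. P, R and A can be obtained from scoring.
2. cross_validate can return all three scores in one run.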
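For reference, the algebra with TP = 1 can be written out as follows. This is only a worked version of Eq1 to Eq3 and adds no assumptions beyond TP = 1.

```latex
\begin{aligned}
\text{Eq1 with } tp = 1:&\quad \frac{1}{1 + fp} = P \;\Rightarrow\; fp = \frac{1}{P} - 1 \\
\text{Eq2 with } tp = 1:&\quad \frac{1}{1 + fn} = R \;\Rightarrow\; fn = \frac{1}{R} - 1 \\
\text{Eq3 with } tp = 1:&\quad \frac{1 + tn}{1 + fn + fp + tn} = A
  \;\Rightarrow\; A\,(1 + fn + fp) + A\,tn = 1 + tn \\
&\quad \Rightarrow\; tn = \frac{1 - A - A\,fn - A\,fp}{A - 1}
\end{aligned}
```

The last step needs A ≠ 1, and the first two need P > 0 and R > 0.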
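A minimal sketch of point 2, assuming a binary task with labels 0 and 1. The toy corpus and labels below are made up for illustration and are not from the original post. The pipeline matches the one in the question.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus (illustration only): 1 = spam, 0 = ham.
texts = [
    "win a free prize now", "free cash offer today", "claim your free prize",
    "cheap offer win cash", "free prize cash now", "win cheap prize today",
    "meeting at noon tomorrow", "see you at lunch", "project report is due",
    "lunch meeting moved to noon", "report due tomorrow morning", "see the project notes",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

pipeline = Pipeline([
    ("bow", CountVectorizer()),
    ("classifier", BernoulliNB()),
])

# Passing a list of scorer names makes cross_validate compute all of them per fold.
scores = cross_validate(pipeline, texts, labels, cv=3,
                        scoring=["precision", "recall", "accuracy"])

# Results are stored under "test_<scorer name>", one value per fold.
for name in ("precision", "recall", "accuracy"):
    print(name, scores[f"test_{name}"])
```

Each array holds one value per fold. These are the P, R and A that go into the equations above.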
```
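def calculate_confusion_matrix_by_assume_tp_equal_to_1(r, p, a):
    """Return (fn, fp, tn) relative to tp = 1, given recall r, precision p and accuracy a."""
    # From tp/(tp+fp)=P, tp/(tp+fn)=R, (tp+tn)/(tp+fn+fp+tn)=A with tp = 1.
    # Requires r > 0, p > 0 and a != 1; otherwise a division by zero occurs.
    fn = (1 / r) - 1
    fp = (1 / p) - 1
    tn = (1 - a - a * fn - a * fp) / (a - 1)
    return fn, fp, tn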
```
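As a sanity check, the formulas can be tested against a confusion matrix whose counts are known. The sketch below uses made-up counts. The helper names `ratios_with_tp_equal_to_1` and `recover_counts` are hypothetical, and rescaling by the total number of samples is an extra step that is not in the answer above. It only works if the sample count is known and r > 0, p > 0, a != 1.

```python
def ratios_with_tp_equal_to_1(r, p, a):
    # Same formulas as in the answer: fn, fp, tn relative to tp = 1.
    fn = 1 / r - 1
    fp = 1 / p - 1
    tn = (1 - a - a * fn - a * fp) / (a - 1)
    return fn, fp, tn


def recover_counts(r, p, a, n_samples):
    # Hypothetical extension: scale the ratios so the four cells sum to n_samples.
    fn, fp, tn = ratios_with_tp_equal_to_1(r, p, a)
    tp = n_samples / (1 + fn + fp + tn)
    return tp, fn * tp, fp * tp, tn * tp


# Made-up confusion matrix used only to check the round trip.
tp, fn, fp, tn = 40, 10, 20, 30
n_samples = tp + fn + fp + tn

precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / n_samples

recovered = recover_counts(recall, precision, accuracy, n_samples)
print([round(x, 6) for x in recovered])
```

Up to floating-point rounding, the recovered values should match the original tp, fn, fp and tn.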