Inter-rater reliability calculation for multi-raters data

Posted by 白昼怎懂夜的黑 on 2019-12-11 06:38:22

Question


I have the following list of lists:

[[1, 1, 1, 1, 3, 0, 0, 1],
 [1, 1, 1, 1, 3, 0, 0, 1],
 [1, 1, 1, 1, 2, 0, 0, 1],
 [1, 1, 0, 2, 3, 1, 0, 1]]

I want to calculate an inter-rater reliability score for this data, where each row holds the ratings from one rater, so there are multiple raters. I cannot use Fleiss' kappa, since the rows do not sum to the same number. What is a good approach in this case?


Answer 1:


The basic problem is that the data is not organized the way Fleiss' kappa expects. See here for the proper organization. You have four categories (ratings 0-3) and eight subjects, so your table must have eight rows and four columns, regardless of the number of reviewers. For instance, the top row is the tally of ratings given to the first item:

[0, 4, 0, 0]   ... since everyone rated it a `1`.

Your -inf value is from dividing by 0 on the P[j] score for the penultimate column.
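
The reorganization described above can be sketched in plain Python. This is a minimal illustration, not code from the original answer: the helper names `count_table` and `fleiss_kappa` are hypothetical, and the kappa computation follows the standard textbook definition, assuming every subject is rated by the same number of raters.

```python
# Rows are raters, columns are subjects (the layout used in the question).
ratings = [[1, 1, 1, 1, 3, 0, 0, 1],
           [1, 1, 1, 1, 3, 0, 0, 1],
           [1, 1, 1, 1, 2, 0, 0, 1],
           [1, 1, 0, 2, 3, 1, 0, 1]]


def count_table(ratings, n_categories):
    """Hypothetical helper: build the subjects x categories tally table."""
    table = []
    for subject_ratings in zip(*ratings):  # transpose: one tuple per subject
        row = [0] * n_categories
        for rating in subject_ratings:
            row[rating] += 1
        table.append(row)
    return table


def fleiss_kappa(table):
    """Hypothetical helper: Fleiss' kappa from a subjects x categories table."""
    n_subjects = len(table)
    n_raters = sum(table[0])  # assumes every row sums to the same count
    # Observed agreement per subject, averaged over all subjects.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in table]
    p_bar = sum(p_i) / n_subjects
    # Share of all ratings that fell into each category.
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in table) / total for j in range(len(table[0]))]
    p_e = sum(p * p for p in p_j)  # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


table = count_table(ratings, n_categories=4)
print(table[0])             # tally for the first subject
print(fleiss_kappa(table))
```

Each row of `table` sums to the number of raters (four here), which is the property Fleiss' kappa needs.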


My earlier answer, normalizing the scores, was based on my misinterpretation of Fleiss' kappa; I had a different reliability measure in mind. There are many ways to compute such a metric: one is consistency of relative rating points (which you can get with normalization); another is to convert each rater's row into a graph of relative rankings and compute a similarity among those graphs.

Note that Fleiss' kappa is not perfectly applicable to a rating situation with a relative metric: it assumes a classification task, not a ranking. It is not sensitive to how far apart the ratings are; it knows only that the ratings differed, so a (0,1) pairing is just as damaging as a (0,3) pairing.
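
To make the last point concrete, here is a small sketch (an illustration, not part of the original answer) that computes the per-subject agreement term of Fleiss' kappa for two raters. The function name `subject_agreement` is hypothetical.

```python
def subject_agreement(tally):
    """Hypothetical helper: Fleiss' per-subject agreement P_i for one tally row."""
    n_raters = sum(tally)
    return (sum(c * c for c in tally) - n_raters) / (n_raters * (n_raters - 1))


# Two raters, four categories (0-3): one tally row per subject.
near_miss = [1, 1, 0, 0]   # the raters gave ratings (0, 1)
far_miss = [1, 0, 0, 1]    # the raters gave ratings (0, 3)
agreed = [2, 0, 0, 0]      # the raters gave ratings (0, 0)

print(subject_agreement(near_miss),
      subject_agreement(far_miss),
      subject_agreement(agreed))
```

Both disagreeing pairs get the same agreement score, even though one pair is one point apart and the other is three points apart.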




Answer 2:


The answer to this problem was to use Krippendorff's alpha:

Wikipedia Description

Python Library

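import krippendorff

# Each row holds one rater's scores; each column is one rated item.
arr = [[1, 1, 1, 1, 3, 0, 0, 1],
       [1, 1, 1, 1, 3, 0, 0, 1],
       [1, 1, 1, 1, 2, 0, 0, 1],
       [1, 1, 0, 2, 3, 1, 0, 1]]
# With no level_of_measurement argument, the library's default metric is used.
res = krippendorff.alpha(reliability_data=arr)
print(res)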
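
For readers who want to see roughly what such a score measures, the following is a pure-Python sketch of Krippendorff's alpha for complete data (no missing ratings), written from the textbook coincidence-matrix definition. It is not the library's implementation; `krippendorff_alpha`, `nominal` and `interval` are hypothetical names chosen for this illustration.

```python
from collections import defaultdict
from itertools import permutations


def nominal(a, b):
    """Disagreement that only records whether two ratings differ."""
    return 0.0 if a == b else 1.0


def interval(a, b):
    """Disagreement that grows with the squared distance between ratings."""
    return float((a - b) ** 2)


def krippendorff_alpha(ratings, distance):
    """Hypothetical sketch: alpha for a raters x subjects matrix, no missing values."""
    coincidences = defaultdict(float)
    for unit in zip(*ratings):              # all ratings given to one subject
        weight = 1.0 / (len(unit) - 1)
        for a, b in permutations(unit, 2):  # ordered pairs from different raters
            coincidences[(a, b)] += weight
    totals = defaultdict(float)             # how often each value occurs overall
    for (a, _), count in coincidences.items():
        totals[a] += count
    n = sum(totals.values())
    observed = sum(count * distance(a, b)
                   for (a, b), count in coincidences.items()) / n
    expected = sum(totals[a] * totals[b] * distance(a, b)
                   for a in totals for b in totals) / (n * (n - 1))
    # Note: expected is 0 (division by zero) if every rating is the same value.
    return 1.0 - observed / expected


arr = [[1, 1, 1, 1, 3, 0, 0, 1],
       [1, 1, 1, 1, 3, 0, 0, 1],
       [1, 1, 1, 1, 2, 0, 0, 1],
       [1, 1, 0, 2, 3, 1, 0, 1]]
print(krippendorff_alpha(arr, nominal))
print(krippendorff_alpha(arr, interval))
```

Unlike Fleiss' kappa, the choice of distance function lets the score take into account how far apart two ratings are. The `krippendorff` package exposes a comparable choice through its `level_of_measurement` argument.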


Source: https://stackoverflow.com/questions/56481245/inter-rater-reliability-calculation-for-multi-raters-data
