Computing degree of similarity among a group of sets

对着背影说爱祢 submitted on 2021-02-17 16:58:25

Question


Suppose there are 4 sets:

s1={1,2,3,4};
s2={2,3,4};
s3={2,3,4,5};
s4={1,3,4,5};

Is there a standard metric for expressing the degree of similarity of this group of 4 sets?

Thank you for suggesting the Jaccard method. However, it seems to be pairwise. How can I compute the degree of similarity of the whole group of sets?


Answer 1:


Pairwise, you can compute the Jaccard distance of two sets: one minus the size of their intersection divided by the size of their union. It treats the sets as vectors of booleans in a space where {1, 2, 3…} are all unit vectors.
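
As a minimal sketch of the pairwise computation, assuming the Jaccard distance is taken as one minus the ratio of intersection size to union size, the four sets from the question could be compared as follows. The helper name `jaccard_distance` and the `sets` dictionary are illustrative and not part of any library.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets (0.0 if both are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# The four sets from the question.
sets = {
    "s1": {1, 2, 3, 4},
    "s2": {2, 3, 4},
    "s3": {2, 3, 4, 5},
    "s4": {1, 3, 4, 5},
}

# Distance for every unordered pair of set names.
pairwise = {
    (m, n): jaccard_distance(sets[m], sets[n])
    for m, n in combinations(sets, 2)
}

for pair, distance in pairwise.items():
    print(pair, round(distance, 3))
```

This yields six distances, one per pair, so it still does not give a single number for the whole group.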




Answer 2:


Your question isn't very specific, but I suppose you mean something like the "edit distance" between them, i.e. how much you need to change s1 to get to s2?

Check out the Wikipedia article on Edit distance.
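
One possible reading of "edit distance" for sets, offered here as an assumption rather than a standard definition, is the number of elements that have to be removed or added to turn one set into the other, which is the size of the symmetric difference. The function name `set_edit_distance` is hypothetical.

```python
def set_edit_distance(a, b):
    """Count the elements to remove from or add to `a` to obtain `b`."""
    # a ^ b is the symmetric difference: elements in exactly one of the sets.
    return len(a ^ b)


s1 = {1, 2, 3, 4}
s2 = {2, 3, 4}
s3 = {2, 3, 4, 5}

print(set_edit_distance(s1, s2))  # remove 1
print(set_edit_distance(s1, s3))  # remove 1, add 5
```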




Answer 3:


As Tobu said, I'd use the Jaccard index, which is just the size of the intersection divided by the size of the union of the sets.
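
To get a single number for the whole group, two simple extensions of the Jaccard index come to mind. Both are sketches under stated assumptions, not an established standard: dividing the intersection of all sets by the union of all sets, or averaging the index over all pairs. The function names below are made up for illustration.

```python
from itertools import combinations


def jaccard_index(a, b):
    """Return |a & b| / |a | b| for two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def group_jaccard(group):
    """Elements shared by every set, divided by elements in any set."""
    common = set.intersection(*group)
    combined = set.union(*group)
    return len(common) / len(combined) if combined else 1.0


def mean_pairwise_jaccard(group):
    """Average Jaccard index over all unordered pairs in the group."""
    pairs = list(combinations(group, 2))
    return sum(jaccard_index(a, b) for a, b in pairs) / len(pairs)


group = [{1, 2, 3, 4}, {2, 3, 4}, {2, 3, 4, 5}, {1, 3, 4, 5}]

print(group_jaccard(group))
print(mean_pairwise_jaccard(group))
```

For the sets in the question, only 3 and 4 appear in all four sets while the union has five elements, so the first measure is 2/5. The first measure is stricter, since a single outlier set shrinks the common intersection, while the pairwise average degrades more gradually.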




Answer 4:


You could compute the size of the intersection between each pair of sets.
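
A short sketch of that idea, assuming the result is wanted as a square table of overlap counts; the variable name `overlap` is illustrative.

```python
# The four sets from the question, in order s1..s4.
sets = [{1, 2, 3, 4}, {2, 3, 4}, {2, 3, 4, 5}, {1, 3, 4, 5}]

# overlap[i][j] is the number of elements shared by sets[i] and sets[j];
# the diagonal holds each set's own size.
overlap = [[len(a & b) for b in sets] for a in sets]

for row in overlap:
    print(row)
```

Raw intersection sizes are not normalized, so larger sets tend to score higher; dividing by the union size gives the Jaccard index.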




Answer 5:


You could compute the Euclidean distance between them, and build a dendrogram from that to visualize similarity.
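
A minimal sketch of the distance step, assuming each set is first encoded as a 0/1 membership vector over the union of all elements (an encoding chosen here for illustration; the answer does not specify one).

```python
from itertools import combinations
from math import sqrt

# The four sets from the question, in order s1..s4.
sets = [{1, 2, 3, 4}, {2, 3, 4}, {2, 3, 4, 5}, {1, 3, 4, 5}]

# Every element that occurs in any set, in a fixed order.
universe = sorted(set().union(*sets))

# One 0/1 membership vector per set.
vectors = [[1 if x in s else 0 for x in universe] for s in sets]


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))


# Distances in pair order (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
condensed = [
    euclidean(vectors[i], vectors[j])
    for i, j in combinations(range(len(vectors)), 2)
]

print(condensed)
```

The `condensed` list follows the pair order expected by `scipy.cluster.hierarchy.linkage`, whose output can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree, assuming SciPy and Matplotlib are installed.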



Source: https://stackoverflow.com/questions/2035326/computing-degree-of-similarity-among-a-group-of-sets
