How can you compare two cluster groupings in terms of similarity or overlap in Python?

佐手、 提交于 2019-12-01 19:08:37

Use evaluation metrics.

Many metrics are symmetric. For example, the adjusted Rand index.

A value close to 1 means they are very similar, close to 0 is random, and much less than 0 means each cluster of one is "evenly" distributed over all clusters of the other.

Well, determining the number of clusters is problem in data analysis and different issue from clustering problem itself. There are quite a few criteria for this AIC or Cubic Clustering Criteria. I think that with scikit-learn there is not an option to calculate these by default two but I know that there are packages in R.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!