Affinity Propagation (sklearn) - strange behavior

Question

I am trying to use affinity propagation for a simple clustering task:

from sklearn.cluster import AffinityPropagation
c = [[0], [0], [0], [0], [0], [0], [0], [0]]
af = AffinityPropagation(affinity='euclidean').fit(c)
print(af.labels_)

I get this strange result: [0 1 0 1 2 1 1 0]

I would expect to have all samples in the same cluster, like in this case:

c = [[0], [0], [0]]
af = AffinityPropagation(affinity='euclidean').fit(c)
print(af.labels_)

which indeed puts all samples in the same cluster: [0 0 0]

What am I missing?

Thanks

Answer 1:

I believe this happens because your problem is essentially ill-posed: you pass many copies of the same point to an algorithm that works by comparing similarities between different points. AffinityPropagation does matrix math under the hood, and your similarity matrix, which is all zeros, is badly degenerate. To avoid failing on such input, the implementation adds a small random matrix to the similarity matrix, so the algorithm does not give up when it encounters two identical points. That noise breaks the ties between identical samples arbitrarily, which is why they can end up with different labels.
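To make the degeneracy concrete, here is a minimal sketch that uses only the standard library. It assumes the 'euclidean' affinity means negative squared Euclidean distance and that the default preference is the median similarity; the helper `similarity_matrix` and the noise scale `1e-300` are hypothetical, chosen for illustration, and are not scikit-learn's actual code or constants.

```python
import random
from statistics import median


def similarity_matrix(points):
    """Negative squared Euclidean distance between every pair of points."""
    return [
        [-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


# Eight identical one-dimensional samples, as in the question.
points = [[0] for _ in range(8)]
S = similarity_matrix(points)

# Every pairwise similarity is the same value, so no sample is a
# better exemplar candidate than any other.
flat = [value for row in S for value in row]
all_equal = len(set(flat)) == 1

# Assumed default preference: the median of the similarities.
preference = median(flat)

# Illustrative tie-breaking noise (arbitrary scale, not sklearn's formula).
rng = random.Random(0)
noisy = [[value + 1e-300 * rng.gauss(0, 1) for value in row] for row in S]
distinct_after_noise = len({value for row in noisy for value in row})

print(all_equal, preference, distinct_after_noise)
```

With identical inputs, the only thing that distinguishes one entry of the matrix from another is the injected noise, so whichever samples become exemplars is decided by chance rather than by the data.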

Source: https://stackoverflow.com/questions/30829917/affinity-propagation-sklearn-strange-behavior

Tags

scikit-learn

cluster-analysis
