Effective clustering of a similarity matrix

Submitted by 喜欢而已 on 2019-12-03 07:53:25

Since you're new to the field, have an unknown number of clusters, and are already using cosine distance, I would recommend the FLAME clustering algorithm.

It's intuitive, easy to implement, and has implementations in a large number of languages (not PHP though, largely because very few people use PHP for data science).

It's also good enough to be used in research by a large number of people. If nothing else, it will show you exactly which shortcomings of this algorithm you want to address when moving on to another one.

Just try some. There are so many clustering algorithms out there that nobody knows all of them. Which one works also depends a lot on your data set and on the cluster structure that is actually there. In the end, there may be just one monster cluster with respect to cosine distance and bag-of-words features.

Maybe you can transform your similarity matrix into a dissimilarity matrix, for example by mapping each similarity x to 1/x (or to 1 - x if the similarities lie in [0, 1], which avoids dividing by zero). Your problem then becomes clustering a dissimilarity matrix, and I think hierarchical clustering may work. These may help you: "hierarchical clustering" and "Clustering a dissimilarity matrix".
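
As a rough illustration of that last suggestion, here is a minimal sketch in Python using SciPy. It assumes the similarities are cosine similarities in [0, 1], so it uses 1 - x rather than 1/x. The helper name `cluster_similarity`, the toy matrix, the average-linkage choice and the 0.5 cut-off are all made up for the example and are not from the original question.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_similarity(sim, threshold):
    """Hierarchically cluster a symmetric similarity matrix (hypothetical helper).

    Assumes similarities lie in [0, 1], e.g. cosine similarity of
    non-negative bag-of-words vectors.
    """
    sim = np.asarray(sim, dtype=float)
    dist = 1.0 - sim                      # similarity -> dissimilarity
    dist = (dist + dist.T) / 2.0          # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)           # an item is at distance 0 from itself
    dist = np.clip(dist, 0.0, None)       # guard against tiny negative values
    condensed = squareform(dist, checks=False)   # SciPy expects the condensed form
    tree = linkage(condensed, method="average")  # average linkage is one option of several
    # Cut the dendrogram at a distance threshold, so the number of
    # clusters does not have to be known in advance.
    return fcluster(tree, t=threshold, criterion="distance")


# Toy data (assumed): items 0 and 1 are similar, items 2 and 3 are similar.
sim = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
])
labels = cluster_similarity(sim, threshold=0.5)
print(labels)
```

The threshold plays the role that the number of clusters would play in k-means: lowering it gives more, tighter clusters, and raising it far enough gives the single monster cluster mentioned above.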
