Clustering on a very large sparse matrix?

Posted by 戏子无情 on 2019-12-01 00:11:56

In your case, I would guess the only real problem is the size of the input.

I would suggest CLUTO as a good tool for large, sparse datasets. It is written in C. I have tried it on around 17 million rows with around 400 columns, and it ran fast.

Link of the Cluto library
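
If your data is not already in a file CLUTO can read, here is a rough sketch of how I would dump it. It assumes CLUTO's plain-text sparse matrix layout as I remember it from the manual: a header line with the number of rows, columns and non-zeros, then one line per row holding "column value" pairs with columns counted from 1. The function name `to_cluto_sparse` and the dict-per-row input are my own choices for illustration, not part of CLUTO, so please check the format against the manual before relying on it.

```python
from typing import Dict, List


def to_cluto_sparse(rows: List[Dict[int, float]], n_cols: int) -> str:
    """Serialize sparse rows into CLUTO-style sparse matrix text.

    Each row is a dict mapping a 0-based column index to its value
    (a hypothetical input layout chosen for this sketch).
    Assumed output format: a header "n_rows n_cols nnz", then one line
    per row of "column value" pairs with 1-based column numbers.
    """
    # Count only entries that are actually non-zero.
    nnz = sum(1 for row in rows for value in row.values() if value != 0)
    lines = [f"{len(rows)} {n_cols} {nnz}"]

    for row in rows:
        pairs = []
        for col in sorted(row):
            if not 0 <= col < n_cols:
                raise ValueError(f"column index {col} out of range")
            value = row[col]
            if value == 0:
                continue  # explicit zeros are not stored
            # Shift to 1-based column numbers; repr keeps full float precision.
            pairs.append(f"{col + 1} {float(value)!r}")
        # A row with no non-zeros becomes an empty line (assumed convention).
        lines.append(" ".join(pairs))

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = [{0: 1.0, 3: 2.5}, {}, {1: 4.0}]
    print(to_cluto_sparse(example, n_cols=4), end="")
```

For 17 million rows you would want to stream the lines to disk instead of building one big string, but the layout stays the same. The resulting file is what I would then pass to CLUTO's command-line clustering program along with the number of clusters.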

You can also try the sparcl package in R; it implements sparse k-means and hierarchical clustering. It is not so easy to understand, though.
