Text clustering using SciPy Hierarchical Clustering in Python

Submitted by ぐ巨炮叔叔 on 2019-11-30 16:32:33

You can do the following:

  1. Align the clustering result (the array of cluster labels) with your input (the 1000+ articles), so that each article is paired with its cluster number.
  2. Using the pandas library, group the articles with groupby, using the cluster number as the key.
  3. For each group (retrieved with get_group), fill a defaultdict of integers, incrementing the count for every word you encounter.
  4. Sort the dictionary of word counts in descending order and take as many of the most frequent words as you need.
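
The four steps above can be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the `articles` list, the `labels` list, the column names `text` and `cluster`, and the helper `top_words` are all made up for the example. In practice `labels` would be whatever per-article cluster numbers your hierarchical clustering produced (for instance from `scipy.cluster.hierarchy.fcluster`), and splitting on whitespace is a stand-in for whatever tokenization you already use.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical stand-ins: `articles` is the input text and `labels` is the
# cluster number assigned to each article, in the same order.
articles = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell as markets closed",
    "markets rallied and stocks rose",
]
labels = [1, 1, 2, 2]

# Step 1: align each article with its cluster label.
df = pd.DataFrame({"text": articles, "cluster": labels})


def top_words(frame, n=3):
    """Return {cluster: [(word, count), ...]} with the n most frequent words."""
    result = {}
    grouped = frame.groupby("cluster")  # Step 2: group by cluster number
    for cluster_id in grouped.groups:
        counts = defaultdict(int)
        # Step 3: count every word in this cluster's articles.
        for text in grouped.get_group(cluster_id)["text"]:
            for word in text.lower().split():
                counts[word] += 1
        # Step 4: sort by count, descending. Ties are broken alphabetically,
        # which is a choice made for this sketch only.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        result[int(cluster_id)] = ranked[:n]
    return result


top = top_words(df, n=2)
for cluster_id, words in top.items():
    print(cluster_id, words)
```

With real articles you would probably want to drop stop words such as "the" before counting, otherwise they tend to dominate every cluster.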

Good luck with what you're doing and please do accept my answer if it's what you're looking for.
