Text clustering with Levenshtein distances

暖寄归人 2020-11-30 22:52

I have a set (2k - 4k) of small strings (3-6 characters) and I want to cluster them. Since I use strings, previous answers on How does clustering (especially String clusteri

4 answers
  •  情歌与酒
    2020-11-30 23:36

    ELKI includes Levenshtein distance, and offers a wide choice of advanced clustering algorithms, for example OPTICS clustering.

    Text clustering support was contributed by Felix Stahlberg, as part of his work on:

    Stahlberg, F., Schlippe, T., Vogel, S., & Schultz, T.
    Word segmentation through cross-lingual word-to-phoneme alignment.
    2012 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2012.

    We would of course appreciate additional contributions.
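
    For readers who want to see the idea outside ELKI, here is a minimal pure-Python sketch. It does not use ELKI's API. It computes Levenshtein distances with the standard dynamic-programming recurrence and, as a stand-in for OPTICS, runs plain DBSCAN (a simpler density-based relative) on the pairwise distances. The function names and the values eps=1 and min_pts=2 are illustrative assumptions, not settings taken from the answer.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance where insert, delete and substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def dbscan(words: List[str], eps: int, min_pts: int) -> List[int]:
    """Return one label per word: 0, 1, ... for clusters, -1 for noise.

    Assumption: this is textbook DBSCAN on a full distance matrix,
    used here only as a simple substitute for OPTICS.
    """
    n = len(words)
    dist = [[levenshtein(words[i], words[j]) for j in range(n)]
            for i in range(n)]
    # A point's neighborhood includes the point itself (distance 0).
    neighbors = [[j for j in range(n) if dist[i][j] <= eps]
                 for i in range(n)]
    labels: List = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point, expand
    return labels


words = ["cat", "cut", "cot", "dog", "dig", "dug", "zzzzzz"]
labels = dbscan(words, eps=1, min_pts=2)
```

    With a few thousand strings the full matrix holds several million distances, so the pure-Python loops above will be slow. A compiled edit-distance routine, or a toolkit such as ELKI, is the more practical choice at that size.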
