How does clustering (especially String clustering) work?

轻奢々 2020-12-07 17:56

I have heard that clustering can be used to group similar data. I want to know how it works in the specific case of strings.

I have a table with more than 100,000 different words.

3 Answers

难免孤独 (original poster)

2020-12-07 18:29

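You can use a string distance such as the Levenshtein distance for the distance calculation and a clustering algorithm such as k-means to form the groups. Note that plain k-means computes centroids as averages of numeric vectors, which is not defined for raw strings, so with an edit distance a medoid-based variant (k-medoids) or hierarchical clustering over the pairwise distances is usually the easier fit.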

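The Levenshtein distance is a string metric that measures the difference between two sequences: it is the minimum number of single-character insertions, deletions, or substitutions needed to turn one string into the other. For example, the distance between "kitten" and "sitting" is 3.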
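To make the definition concrete, here is a minimal sketch of the standard dynamic-programming computation in plain Python. The function name `levenshtein` is just an illustrative choice; for 100,000 words you would probably want an optimized library instead of this pure-Python version.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible

    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory stays proportional to the shorter string.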

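Then experiment on a sample of your data to find a similarity threshold per word, that is, the largest distance at which two words should still land in the same group. That threshold decides your groups.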
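One simple way to try out a threshold is a greedy grouping pass. This is only a sketch and not k-means: each word joins the first group whose representative (its first member) is within the threshold, otherwise it starts a new group. The names `group_by_threshold` and `max_distance` are made up for this example, and the result depends on the order of the input.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the usual two-row dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def group_by_threshold(words, max_distance=1):
    """Greedy grouping: compare each word with the first member of
    every existing group and join the first one that is close enough."""
    groups = []
    for word in words:
        for group in groups:
            if levenshtein(word, group[0]) <= max_distance:
                group.append(word)
                break
        else:
            # no existing group was close enough, so start a new one
            groups.append([word])
    return groups


words = ["color", "colour", "colors", "table", "tables", "cable"]
for group in group_by_threshold(words, max_distance=1):
    print(group)
# ['color', 'colour', 'colors']
# ['table', 'tables', 'cable']
```

A fixed absolute threshold treats short and long words the same. If that matters for your table, dividing the distance by the length of the longer word before comparing is one option worth testing.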
