string comparison with the most similar string

Backend  Unresolved  3  2200
梦谈多话 2020-12-03 05:21

Does anyone know whether there is an algorithm that, given a string A and an array of strings B, compares A with every string in B and outputs the most similar o

3 answers
  •  夕颜 (original poster)
    2020-12-03 05:57

    Define similarity. Algorithms that can do this include:

    1. Levenshtein/LCS/n-gram distance (compare the string with each string in your set and take the one with the lowest distance)
    2. tf-idf indexing
    3. Levenshtein automata
    4. Hopfield networks
    5. BK-trees
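
As a rough illustration of option 1, here is a minimal C++ sketch that scans the whole array and keeps the candidate with the lowest Levenshtein distance. The function names `levenshtein` and `most_similar` are made up for this example, and it assumes byte-wise comparison of `std::string` (no Unicode handling); ties go to the first candidate.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Levenshtein (edit) distance, computed with two rolling rows of the
// dynamic-programming table. Cost: O(|a| * |b|) time, O(|b|) memory.
std::size_t levenshtein(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Hypothetical helper: index of the candidate closest to `query`.
// Returns candidates.size() when the array is empty.
std::size_t most_similar(const std::string& query,
                         const std::vector<std::string>& candidates) {
    std::size_t best = candidates.size();
    std::size_t best_dist = 0;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        std::size_t d = levenshtein(query, candidates[i]);
        if (best == candidates.size() || d < best_dist) {
            best = i;
            best_dist = d;
        }
    }
    return best;
}
```

For example, `most_similar("apple", {"banana", "apply", "grape"})` would return index 1. The linear scan is the simplest approach; the index structures further down the list only pay off when the array is large or queried many times.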

    All of which can feasibly be implemented in C or C++. Google "string similarity", "duplicate finding" or "record linkage" for the available metrics and algorithms.
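
To give an idea of what one of the index-based options might look like in C++, here is a small BK-tree sketch keyed on edit distance. The class name `BKTree`, its `insert`/`search` methods and the `edit_distance` helper are assumptions made for this example, not an existing library API. `search` returns every stored word within `max_dist` of the query; one way to get the single most similar string would be to retry with a growing `max_dist` until something comes back.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Plain Levenshtein distance (the BK-tree needs a true metric).
std::size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

class BKTree {
public:
    // Walk down the tree, following the child edge labelled with the
    // distance to the current node, until a free slot is found.
    void insert(const std::string& word) {
        if (!root_) { root_ = std::make_unique<Node>(word); return; }
        Node* node = root_.get();
        while (true) {
            std::size_t d = edit_distance(word, node->word);
            if (d == 0) return;  // word is already stored
            auto it = node->children.find(d);
            if (it == node->children.end()) {
                node->children.emplace(d, std::make_unique<Node>(word));
                return;
            }
            node = it->second.get();
        }
    }

    // All stored words within max_dist of query. By the triangle
    // inequality only children with edge label in [d - max_dist,
    // d + max_dist] can contain matches, so other subtrees are skipped.
    std::vector<std::string> search(const std::string& query,
                                    std::size_t max_dist) const {
        std::vector<std::string> out;
        std::vector<const Node*> stack;
        if (root_) stack.push_back(root_.get());
        while (!stack.empty()) {
            const Node* node = stack.back();
            stack.pop_back();
            std::size_t d = edit_distance(query, node->word);
            if (d <= max_dist) out.push_back(node->word);
            std::size_t lo = d > max_dist ? d - max_dist : 0;
            std::size_t hi = d + max_dist;
            for (auto it = node->children.lower_bound(lo);
                 it != node->children.end() && it->first <= hi; ++it) {
                stack.push_back(it->second.get());
            }
        }
        return out;
    }

private:
    struct Node {
        explicit Node(std::string w) : word(std::move(w)) {}
        std::string word;
        std::map<std::size_t, std::unique_ptr<Node>> children;
    };
    std::unique_ptr<Node> root_;
};
```

Building the tree costs one distance computation per level for each inserted word, so this is only worth it if the same array B is queried many times. The order of the results from `search` is not specified in this sketch.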
