string comparison with the most similar string

梦谈多话 2020-12-03 05:21

Does anyone know whether there is an algorithm that, given a string A and an array of strings B, compares A with every string in B and outputs the most similar o

3 answers
  • 2020-12-03 05:57

    The usual measurement for this is the Levenshtein distance. Compute the Levenshtein distance from the original to each candidate, and take the smallest distance as the most likely candidate.
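
    As a rough illustration of the approach above, here is a minimal sketch in Python (chosen for brevity; the same loops port directly to C or C++). The function names `levenshtein` and `most_similar` are made up for this example, and ties are resolved by taking the first candidate, which is an assumption rather than part of the answer.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def most_similar(target: str, candidates: list[str]) -> str:
    """Return the candidate with the smallest edit distance to target.

    Ties go to the first such candidate (an arbitrary choice for this sketch).
    """
    return min(candidates, key=lambda c: levenshtein(target, c))
```

    For example, `most_similar("aple", ["banana", "apple", "orange"])` picks `"apple"`. Each comparison costs O(len(A) * len(B[i])) time, so this linear scan is fine for small arrays but becomes slow for large ones.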

  • 2020-12-03 05:57

    Define similarity. Algorithms that can do this include:

    1. Levenshtein/LCS/n-gram distance (compare the string with each of the strings in your set, take the one with lowest distance)
    2. tf-idf indexing
    3. Levenshtein automata
    4. Hopfield networks
    5. BK-trees

    All of these can feasibly be implemented in C or C++. Google "string similarity", "duplicate finding" or "record linkage" for the available metrics and algorithms.
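
    To make one of the listed options concrete, here is a hedged sketch of a BK-tree in Python (used for brevity; the structure translates to C or C++). A BK-tree indexes the strings by their distance to a parent node, so a query can skip subtrees that the triangle inequality rules out. The class name `BKTree`, the tuple-based node layout, and the `max_distance` parameter are choices made for this example only.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


class BKTree:
    """BK-tree over a metric; each node is (word, {distance: child_node})."""

    def __init__(self, distance=levenshtein):
        self.distance = distance
        self.root = None

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = self.distance(word, node[0])
            if d == 0:
                return  # word is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child  # descend along the edge labelled d

    def search(self, query, max_distance):
        """Return sorted (distance, word) pairs within max_distance of query."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            word, children = stack.pop()
            d = self.distance(query, word)
            if d <= max_distance:
                results.append((d, word))
            # Triangle inequality: matches can only sit under edges
            # labelled between d - max_distance and d + max_distance.
            for edge, child in children.items():
                if d - max_distance <= edge <= d + max_distance:
                    stack.append(child)
        return sorted(results)
```

    Build the tree once from the array B with `add`, then call `search(A, k)` and take the first result. If nothing comes back, retry with a larger `k`. Whether this beats a plain linear scan depends on the size of B and on how small `k` can be kept.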

  • 2020-12-03 06:11

    This is usually done by checking a bunch of variations of the string that you have. Take a look at spelling correction algorithms, e.g. here
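
    The link target did not survive, so the following is only a guess at the kind of technique meant: generate every string one edit away from the input and keep those that appear in the candidate set, as simple spelling correctors do. It is a minimal Python sketch. The names `edits1` and `closest_known`, and the lowercase ASCII alphabet, are assumptions made for this example.

```python
import string


def edits1(word, alphabet=string.ascii_lowercase):
    """All strings one delete, transpose, replace or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def closest_known(word, known):
    """Return the known strings equal to word or exactly one edit away."""
    known = set(known)
    if word in known:
        return {word}
    return edits1(word) & known
```

    This inverts the usual loop: instead of measuring the distance from A to every string in B, it enumerates the neighbours of A and looks them up in a set, which pays off when B is large and A is short. It only finds candidates within one edit (counting a swap of adjacent characters as one edit); an empty result means nothing in B is that close.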
