Fuzzy Text Matching C#

时光取名叫无心 2020-12-28 15:28

I'm writing a desktop UI (.NET WinForms) to help a photographer clean up his image metadata. There is a list of 66k+ phrases. Can anyone suggest a good open source/free

1 answer
  • 2020-12-28 16:05

    Let me introduce you to the Levenshtein distance formula. It is awesome:

    http://en.wikipedia.org/wiki/Levenshtein_distance

    In information theory and computer science, the Levenshtein distance is a string metric for measuring the amount of difference between two sequences. The term edit distance is often used to refer specifically to Levenshtein distance.

    Personally, I used this in a healthcare setting, where provider names were checked for duplicates. Using the Levenshtein distance, we gave each potential match a confidence rating and let the users determine whether it was a true duplicate or something unique.
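
    As a rough illustration of the idea, here is a minimal C# sketch of the standard dynamic-programming Levenshtein distance, plus one way to turn it into a confidence rating. The names FuzzyMatch, Similarity, and Closest are hypothetical, and the score (1 minus distance divided by the longer length) and the case-insensitive comparison are assumptions, not something taken from the healthcare project mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; not part of any library.
public static class FuzzyMatch
{
    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions, or substitutions needed to turn a into b.
    public static int Distance(string a, string b)
    {
        if (a.Length == 0) return b.Length;
        if (b.Length == 0) return a.Length;

        // Only two rows of the dynamic-programming table are kept in memory.
        int[] previous = new int[b.Length + 1];
        int[] current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1,   // insertion
                             previous[j] + 1),     // deletion
                    previous[j - 1] + cost);       // substitution or match
            }
            // Swap rows so "previous" holds the row just computed.
            int[] tmp = previous;
            previous = current;
            current = tmp;
        }
        return previous[b.Length];
    }

    // Assumed confidence rating in [0, 1]; 1.0 means the strings are identical.
    public static double Similarity(string a, string b)
    {
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 1.0;
        return 1.0 - (double)Distance(a, b) / longest;
    }

    // Returns the phrases scoring at least minScore, best match first.
    // Comparison ignores case (an assumption that suits keyword cleanup).
    public static List<KeyValuePair<string, double>> Closest(
        string input, IEnumerable<string> phrases, double minScore)
    {
        string needle = input.ToLowerInvariant();
        return phrases
            .Select(p => new KeyValuePair<string, double>(
                p, Similarity(needle, p.ToLowerInvariant())))
            .Where(m => m.Value >= minScore)
            .OrderByDescending(m => m.Value)
            .ToList();
    }
}
```

    For example, FuzzyMatch.Distance("kitten", "sitting") returns 3. Each call to Closest scans the whole list, so with 66k+ phrases it is worth measuring whether that is fast enough for interactive use before adding any indexing.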
