Levenshtein distance-based methods vs. Soundex

灰色年华 2020-12-03 01:51

As per this comment in a related thread, I'd like to know why Levenshtein distance based methods are better than Soundex.

4 answers
  •  青春惊慌失措
    2020-12-03 02:39

    I would suggest using Metaphone rather than Soundex. As noted, Soundex was developed in the early 20th century for American names. Metaphone gives useful results when checking the work of poor spellers who are "sounding it out" and spelling phonetically.
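To make the idea of a phonetic key concrete, here is a minimal sketch of classic American Soundex in Python. It is Soundex, not Metaphone: Metaphone uses a much larger rule set, and Soundex is shown only because it fits in a few lines. The function name `soundex` and the handling of non-letter characters are my own assumptions.

```python
# Digit codes for consonants in classic American Soundex.
CODES = {}
for digit, letters in (("1", "BFPV"), ("2", "CGJKQSXZ"), ("3", "DT"),
                       ("4", "L"), ("5", "MN"), ("6", "R")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character Soundex key of `name` (sketch, ASCII letters only)."""
    letters = [c for c in name.upper() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                  # the first letter is kept as-is
    prev = CODES.get(letters[0], "")     # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                     # H and W do not separate equal codes
        code = CODES.get(ch, "")         # vowels and Y have no code
        if code and code != prev:
            result += code
        prev = code                      # a vowel resets prev, allowing a repeat
    return (result + "000")[:4]          # pad or truncate to 4 characters


print(soundex("Robert"), soundex("Rupert"))   # both map to R163
print(soundex("Smith"), soundex("Smyth"))     # both map to S530
```

Two spellings that sound alike collapse to the same key, which is what makes a phonetic code useful for "sounding it out" errors. The coarse key is also the weakness: many unrelated names share a code.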

    Edit distance is good at catching typos such as repeated letters, transposed letters, or hitting the wrong key.
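As an illustration of how edit distance scores these kinds of typos, here is a small sketch of the standard dynamic-programming Levenshtein distance in Python. The function name `levenshtein` is my own; this is a plain textbook version, not an optimized one.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a                      # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))       # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]                        # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute, or match for free
        prev = cur
    return prev[-1]


print(levenshtein("letter", "leter"))    # dropped repeated letter: 1
print(levenshtein("cat", "car"))         # wrong key: 1
print(levenshtein("form", "from"))       # transposed letters: 2
```

Note that plain Levenshtein counts a transposition as two edits. The Damerau-Levenshtein variant treats an adjacent swap as a single edit, which may suit typo correction better.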

    Consider the application to decide which will fit your users best, or use both together, with Metaphone complementing the suggestions produced by Levenshtein.

    With regard to the original question, I've used n-grams successfully in information retrieval applications.
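For readers unfamiliar with n-gram matching, the following is a rough Python sketch of one common form: comparing the sets of character bigrams of two strings with the Jaccard coefficient. The answer does not say which n-gram scheme was used, so the choice of bigrams, the Jaccard measure, and the names `char_ngrams` and `ngram_similarity` are all assumptions for illustration.

```python
def char_ngrams(text, n=2):
    """Set of overlapping character n-grams of `text`, lowercased."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(a, b, n=2):
    """Jaccard similarity of the n-gram sets of a and b, in [0, 1]."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        # Strings shorter than n have no n-grams; fall back to equality.
        return 1.0 if a.lower() == b.lower() else 0.0
    return len(ga & gb) / len(ga | gb)


print(ngram_similarity("night", "nacht"))   # share only "ht": 1/7
print(ngram_similarity("night", "night"))   # identical: 1.0
```

Because n-gram sets can be indexed, this style of matching scales to large vocabularies more easily than computing an edit distance against every candidate.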
