How do I use a Trie for spell checking

五迷三道 提交于 2019-12-21 03:57:29

问题


I have a trie that I've built from a dictionary of words. I want to use this for spell checking( and suggest closest matches in the dictionary , maybe for a given number of edits x). I'm thinking I'd use levenshtein distance between the target word and words in my dictionary, but is there a smart way to traverse the trie without actually running the edit distance logic over each word separately? How should I do the traversal and the edit distance matching?

For e.g, if I have words MAN, MANE, I should be able to reuse the edit distance computation on MAN in MANE. Otherwise the Trie wouldnt serve any purpose


回答1:


Try computing for each tree node an array A where A[x] the smallest edit distance to be at that position in the trie after matching the first x letters of the target word.

You can then stop examining any nodes if every element in the array is greater than your target distance.

For example, with a trie containing MAN and MANE and an input BANE:

Node 0 representing '', A=[0,1,2,3,4]
Node 1 representing 'M', A=[1,1,2,3,4]
Node 2 representing 'MA', A=[2,1,1,2,3]
Node 3 representing 'MAN' A=[3,2,2,1,2]
Node 4 representing 'MANE' A=[4,3,2,2,1]

The smallest value for A[end] is 1 reached with the word 'MANE' so this is the best match.




回答2:


I think you should instead give a try to bk-trees; it's a data structure that fits well spell-checking as it will allow you to compute efficiently the edit distance with the words of your dictionary.

This link gives a good insight into BK-trees applied to spell-checking



来源:https://stackoverflow.com/questions/21366631/how-do-i-use-a-trie-for-spell-checking

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!