Good algorithm and data structure for looking up words with missing letters?

不思量自难忘° 2020-12-07 07:12

I need to write an efficient algorithm for looking up words with missing letters in a dictionary, and I want it to return the set of all possible matching words.

For example, if I have th??

20 Answers
  • 2020-12-07 07:45

    Build a hash set of all the words. To find matches, replace the question marks in the pattern with each possible combination of letters. If there are two question marks, a query consists of 26² = 676 quick, constant-expected-time hash table lookups.

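    import itertools
    import string
    
    # Load the dictionary into a set for expected O(1) membership tests.
    with open("/usr/share/dict/words") as f:
        words = set(f.read().split())
    
    def query(pattern):
        # Positions of the wildcards; they do not have to be adjacent.
        holes = [i for i, c in enumerate(pattern) if c == '?']
        for combo in itertools.product(string.ascii_lowercase, repeat=len(holes)):
            attempt = list(pattern)
            for i, letter in zip(holes, combo):
                attempt[i] = letter
            attempt = ''.join(attempt)
            if attempt in words:
                print(attempt)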
    

    This uses less memory than my other answer, but it gets exponentially slower as you add more question marks.

  • 2020-12-07 07:47

    Have you considered using a Ternary Search Tree? The lookup speed is comparable to a trie, but it is more space-efficient.

    I have implemented this data structure several times, and it is quite straightforward in most languages.
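
    To make the suggestion concrete, here is a minimal sketch of a ternary search tree with wildcard lookup in Python. It is not taken from this answer: the names `Node`, `insert`, `match` and `build` are made up for illustration, and it assumes that `?` stands for exactly one letter and never appears in a dictionary word.

```python
class Node:
    """One ternary search tree node: a character plus three children."""
    __slots__ = ("ch", "lo", "eq", "hi", "is_word")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_word = False


def insert(node, word, i=0):
    """Insert word[i:] below node and return the (possibly new) subtree root."""
    ch = word[i]
    if node is None:
        node = Node(ch)
    if ch < node.ch:
        node.lo = insert(node.lo, word, i)
    elif ch > node.ch:
        node.hi = insert(node.hi, word, i)
    elif i + 1 < len(word):
        node.eq = insert(node.eq, word, i + 1)
    else:
        node.is_word = True
    return node


def match(node, pattern, i=0, prefix=""):
    """Yield every stored word matching pattern ('?' matches any one letter)."""
    if node is None or i >= len(pattern):
        return
    ch = pattern[i]
    # A wildcard explores all three branches; a literal only the relevant one.
    if ch == "?" or ch < node.ch:
        yield from match(node.lo, pattern, i, prefix)
    if ch == "?" or ch == node.ch:
        if i + 1 == len(pattern):
            if node.is_word:
                yield prefix + node.ch
        else:
            yield from match(node.eq, pattern, i + 1, prefix + node.ch)
    if ch == "?" or ch > node.ch:
        yield from match(node.hi, pattern, i, prefix)


def build(words):
    """Build a tree from an iterable of words, skipping empty strings."""
    root = None
    for w in words:
        if w:
            root = insert(root, w)
    return root


root = build(["this", "that", "then", "the", "thus", "toss", "bath"])
print(list(match(root, "th??")))
```

    Literal characters follow a single branch, so only the `?` positions fan out, and only into letters that actually occur in the dictionary at that position. Inserting the words in sorted order tends to produce long `hi` chains, so shuffling the word list before calling `build` is probably worthwhile for a large dictionary.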
