How to compute “shortest distance” between two words?

丶灬走出姿态 提交于 2019-12-05 02:54:36

问题


Recently I had an interview and I was asked to write a algorithm to find the minimum number of 1 letter changes to get from a particular of word to a given word , i.e. Cat->Cot->Cog->Dog

I dont want the solution of the problem just guide me through How I can use BFS in this algorithm ?


回答1:


according to this scrabble list, the shortest path between cat and dog is: ['CAT', 'COT', 'COG', 'DOG']

from urllib import urlopen

def get_words():
    try:
        html = open('three_letter_words.txt').read()
    except IOError:
        html = urlopen('http://www.yak.net/kablooey/scrabble/3letterwords.html').read()
        with open('three_letter_words.txt', 'w') as f:
            f.write(html)

    b = html.find('<PRE>') #ignore the html before the <pre>
    while True:
        a = html.find("<B>", b) + 3
        b = html.find("</B>", a)
        word = html[a: b]
        if word == "ZZZ":
            break
        assert(len(word) == 3)
        yield word

words = list(get_words())

def get_template(word):
    c1, c2, c3 = word[0], word[1], word[2]
    t1 = 1, c1, c2
    t2 = 2, c1, c3
    t3 = 3, c2, c3
    return t1, t2, t3

d = {}
for word in words:
    template = get_template(word)
    for ti in template:
        d[ti] = d.get(ti, []) + [word] #add the word to the set of words with that template

for ti in get_template('COG'):
    print d[ti]
#['COB', 'COD', 'COG', 'COL', 'CON', 'COO', 'COO', 'COP', 'COR', 'COS', 'COT', 'COW', 'COX', 'COY', 'COZ']
#['CIG', 'COG']
# ['BOG', 'COG', 'DOG', 'FOG', 'HOG', 'JOG', 'LOG', 'MOG', 'NOG', 'TOG', 'WOG']

import networkx
G = networkx.Graph()

for word_list in d.values():
    for word1 in word_list:
        for word2 in word_list:
            if word1 != word2:
                G.add_edge(word1, word2)

print G['COG']
#{'COP': {}, 'COS': {}, 'COR': {}, 'CIG': {}, 'COT': {}, 'COW': {}, 'COY': {}, 'COX': {}, 'COZ': {}, 'DOG': {}, 'CON': {}, 'COB': {}, 'COD': {}, 'COL': {}, 'COO': {}, 'LOG': {}, 'TOG': {}, 'JOG': {}, 'BOG': {}, 'HOG': {}, 'FOG': {}, 'WOG': {}, 'NOG': {}, 'MOG': {}}

print networkx.shortest_path(G, 'CAT', 'DOG')
['CAT', 'OCA', 'DOC', 'DOG']

As a bonus we can get the farthest:

print max(networkx.all_pairs_shortest_path(G, 'CAT')['CAT'].values(), key=len)
#['CAT', 'CAP', 'YAP', 'YUP', 'YUK']



回答2:


At first sight I thaught about Levenshtein distance but you need to use BFS. So I think that you should start from building tree. Given word should be root and then next nodes are words with changed first letter. Next next nodes have changed second letter. When you build the graph you use BFS and when you found new word store the path length. At the end of algorithm choose minimal distance.




回答3:


  1. Begin with just the starting word in your path set.

  2. If the ending word of any path in your path set is the desired word, stop, that path is the desired path.

  3. Replace each path in your path set with every possible path that starts with that path but is one word longer.

  4. Go to step 2.




回答4:


If we start to build a directed acyclic graph from the destination word to the source word, in a breadth-wise fashion, and we do a dictionary look-up to verify if we have seen the word earlier in the tree while adding the word, then the first occurrence of the source word,should give the shortest path in the reverse direction from the 'target word' to the 'source word'.

From this we can print the path from the 'source' to the 'target'



来源:https://stackoverflow.com/questions/11811918/how-to-compute-shortest-distance-between-two-words

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!