Using Python, find anagrams for a list of words

失恋的感觉 2020-12-13 01:11

If I have a list of strings for example:

["car", "tree", "boy", "girl", "arc"...]

What should I do in order to find anagrams in t

22 answers
  • 2020-12-13 01:17

    One solution is to sort the letters of the word you want anagrams of (for example with sorted), sort the letters of each candidate, and compare the results.

    So if you were searching for anagrams of 'rac' in the list ['car', 'girl', 'tofu', 'rca'], your code could look like this:

    word = sorted('rac')
    alternatives = ['car', 'girl', 'tofu', 'rca']
    
    for alt in alternatives:
        if word == sorted(alt):
            print(alt)
    
  • 2020-12-13 01:17
    def findanagranfromlistofwords(li):
        # Map each word to the later words in the list that are its anagrams.
        # (A list is used as the value so that no match gets overwritten.)
        anagrams = {}
        for i in range(len(li)):
            sortedfirst = ''.join(sorted(str(li[i])))
            for j in range(i + 1, len(li)):
                sortednext = ''.join(sorted(str(li[j])))
                if sortedfirst == sortednext:
                    anagrams.setdefault(li[i], []).append(li[j])
    
        print(anagrams)
        return anagrams
    
    findanagranfromlistofwords(["car", "tree", "boy", "girl", "arc"])
    
  • 2020-12-13 01:20

    A solution in Python could look like this:

    class Word:
        def __init__(self, data, index):
            self.data = data
            self.index = index
    
    def printAnagrams(arr):
        dupArray = []
        size = len(arr)
    
        for i in range(size):
            dupArray.append(Word(arr[i], i))
    
        for i in range(size):
            dupArray[i].data = ''.join(sorted(dupArray[i].data))
    
        dupArray = sorted(dupArray, key=lambda x: x.data)
    
        for i in range(size):
            print(arr[dupArray[i].index])
    
    def main():
        arr = ["dog", "act", "cat", "god", "tac"]
    
        printAnagrams(arr)
    
    if __name__ == '__main__':
        main()
    
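    1. First, create a duplicate list of the words, storing each word together with its index in the original list.
    2. Then sort the letters of each string in the duplicate list.
    3. Then sort the duplicate list itself by those sorted strings, so that anagrams become adjacent.
    4. Finally, print the words of the original list in that order, using the stored indexes.

    The time complexity of the above is O(NM log N + NM log M), where N is the number of words and M is the maximum word length. This simplifies to O(NM log N) when M is no larger than N.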
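A variation on the steps above, as a rough sketch: instead of printing the reordered words, it collects adjacent words with the same sorted key into groups using itertools.groupby. The function name group_anagrams_sorted is made up for illustration, and (key, word) tuples stand in for the Word class.

```python
from itertools import groupby

def group_anagrams_sorted(words):
    # Pair each word with its sorted-letter key, then sort the pairs by key
    # so that anagrams end up next to each other.
    keyed = sorted((''.join(sorted(w)), w) for w in words)
    groups = []
    for _, pairs in groupby(keyed, key=lambda pair: pair[0]):
        group = [word for _, word in pairs]
        if len(group) > 1:  # keep only words that actually have an anagram
            groups.append(group)
    return groups

print(group_anagrams_sorted(["dog", "act", "cat", "god", "tac"]))
# [['act', 'cat', 'tac'], ['dog', 'god']]
```

Because the tuples are sorted as a whole, words inside a group come out in alphabetical order rather than in their original list order.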

  • 2020-12-13 01:20

    A set is an appropriate data structure for the output, since you presumably don't want redundancy in the output. A dictionary is ideal for looking up if a particular sequence of letters has been previously observed, and what word it originally came from. Taking advantage of the fact that we can add the same item to a set multiple times without expanding the set lets us get away with one for loop.

    def return_anagrams(word_list):
        d = {}
        out = set()
        for word in word_list:
            s = ''.join(sorted(word))
            try:
                out.add(d[s])
                out.add(word)
            except KeyError:
                d[s] = word
        return out
    

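    A potentially faster way avoids sorting and counts how often each letter occurs instead. Because addition is commutative, the counts do not depend on the order of the letters, so anagrams produce the same count vector. Note that this version assumes the words contain only lowercase ASCII letters (a-z):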

    import numpy as np
    
    def vector_anagram(l):
        d, out = dict(), set()
        for word in l:
            s = np.zeros(26, dtype=int)
            for c in word:
                s[ord(c)-97] += 1
            s = tuple(s)
            try:
                out.add(d[s])
                out.add(word)
            except KeyError:
                d[s] = word
        return out
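For comparison, here is a sketch of the same letter-counting idea without NumPy, using a plain list of 26 counters. The names letter_counts and count_anagrams are hypothetical, and like the snippet above it assumes lowercase a-z input only.

```python
def letter_counts(word):
    # Assumes lowercase ASCII letters a-z only; other characters
    # would index outside the 26 slots.
    counts = [0] * 26
    for c in word:
        counts[ord(c) - ord('a')] += 1
    return tuple(counts)  # tuples are hashable, so they can be dict keys

def count_anagrams(words):
    first_seen, out = {}, set()
    for word in words:
        key = letter_counts(word)
        if key in first_seen:
            # Another word with the same letter counts was seen earlier.
            out.add(first_seen[key])
            out.add(word)
        else:
            first_seen[key] = word
    return out

print(sorted(count_anagrams(["car", "tree", "boy", "girl", "arc"])))
# ['arc', 'car']
```

Which variant is actually faster depends on word length and input size, so it is worth timing both on real data.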
    
  • 2020-12-13 01:23

    Sort the letters of each element, then look for duplicates among the sorted results. There's a built-in function for sorting (sorted), so you do not need to import anything.
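One way this could look, as a minimal sketch: the sorted letters serve as a dictionary key, and any key that collects more than one word marks a group of anagrams. The function name anagram_groups is made up here, and defaultdict is only a convenience (a plain dict with setdefault would avoid the import).

```python
from collections import defaultdict

def anagram_groups(words):
    groups = defaultdict(list)
    for word in words:
        # Words with the same sorted letters land in the same bucket.
        groups[''.join(sorted(word))].append(word)
    # Keep only buckets with duplicates, i.e. real anagram groups.
    return [g for g in groups.values() if len(g) > 1]

print(anagram_groups(["car", "tree", "boy", "girl", "arc"]))
# [['car', 'arc']]
```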

  • 2020-12-13 01:23

    Simple Solution in Python:

    def anagram(s1,s2):
    
        # Remove spaces and convert to lowercase
        s1 = s1.replace(' ','').lower()
        s2 = s2.replace(' ','').lower()
    
        # Return sorted match.
        return sorted(s1) == sorted(s2)
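To apply this pairwise check to a list, something like the following sketch might work. The helper anagrams_of is hypothetical (not part of the answer above), and anagram is repeated so that the block runs on its own.

```python
def anagram(s1, s2):
    # Remove spaces and convert to lowercase
    s1 = s1.replace(' ', '').lower()
    s2 = s2.replace(' ', '').lower()
    return sorted(s1) == sorted(s2)

def anagrams_of(word, candidates):
    # Hypothetical helper: every candidate that is an anagram of `word`,
    # skipping the word itself.
    return [c for c in candidates if c != word and anagram(word, c)]

words = ["car", "tree", "boy", "girl", "arc"]
print(anagrams_of("car", words))           # ['arc']
print(anagram("Dormitory", "dirty room"))  # True
```

Since each call sorts both strings again, this is fine for a single lookup but does repeated work if you compare every pair in a long list.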
    