Using Python, find anagrams for a list of words

失恋的感觉 2020-12-13 01:11

If I have a list of strings, for example:

["car", "tree", "boy", "girl", "arc"...]

What should I do in order to find anagrams in t

22 answers
  • 2020-12-13 01:23

    This should help. It assumes the input is given as comma-separated strings.

    Console input: abc,bac,car,rac,pqr,acb,acr,abc

    # Read the comma-separated words from the console (Python 3).
    in_list = input("Enter strings separated by comma: ").split(',')
    list_anagram = []  # sorted-letter signatures that were already reported

    for i in range(len(in_list) - 1):
        if sorted(in_list[i]) not in list_anagram:
            # Look for a later word with the same sorted letters.
            for j in range(i + 1, len(in_list)):
                if sorted(in_list[i]) == sorted(in_list[j]):
                    list_anagram.append(sorted(in_list[i]))
                    print(in_list[i], 'is an anagram')
                    break
    
  • 2020-12-13 01:23

    Simply use the Counter class from the collections module in Python's standard library.

    from collections import Counter

    str1 = "abc"
    str2 = "cab"

    Counter(str1) == Counter(str2)
    # returns True, i.e. both strings are anagrams of each other.
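
To apply the same comparison to a whole list, one option is to check every pair of words. The following is only a sketch: the helper name `is_anagram` and the variable `pairs` are made up for illustration, and the word list is the first five sample words from the question.

```python
from collections import Counter

def is_anagram(a, b):
    """Return True if a and b contain the same letters the same number of times."""
    return Counter(a) == Counter(b)

# Sample words taken from the question.
words = ["car", "tree", "boy", "girl", "arc"]

# Compare each word with every later word in the list.
pairs = [(words[i], words[j])
         for i in range(len(words))
         for j in range(i + 1, len(words))
         if is_anagram(words[i], words[j])]

print(pairs)  # [('car', 'arc')]
```

This does a quadratic number of comparisons, so it is best suited to short lists.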
    
  • 2020-12-13 01:25

    Since you can't import anything, here are two different approaches including the for loop you asked for.

    Approach 1: For Loops and Inbuilt Sorted Function

    word_list = ["percussion", "supersonic", "car", "tree", "boy", "girl", "arc"]
    
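    # collect every word that has at least one anagram elsewhere in the list
    anagram_list = []
    for word_1 in word_list:
        for word_2 in word_list:
            if word_1 != word_2 and sorted(word_1) == sorted(word_2):
                anagram_list.append(word_1)
                break  # one match is enough; avoids adding word_1 more than once
    print(anagram_list)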
    

    Approach 2: Dictionaries

    def freq(word):
        freq_dict = {}
        for char in word:
            freq_dict[char] = freq_dict.get(char, 0) + 1
        return freq_dict
    
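    # collect every word whose letter counts match another word
    # (word_list is the list defined in Approach 1)
    anagram_list = []
    for word_1 in word_list:
        for word_2 in word_list:
            if word_1 != word_2 and freq(word_1) == freq(word_2):
                anagram_list.append(word_1)
                break  # one match is enough; avoids adding word_1 more than once
    print(anagram_list)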
    

    If you want these approaches explained in more detail, here is an article.

  • 2020-12-13 01:28

    I use a dictionary to count each character of the first string. Then I iterate through the second string and look up each character in the dictionary; if it is present, I decrease the count for that key. The two strings are anagrams if every count ends up at zero.

    class Anagram:

        def is_anagram(self, s1, s2):
            # Compare case-insensitively.
            s1 = s1.lower()
            s2 = s2.lower()

            # Count each character of the first string.
            # A local dictionary is used so repeated calls do not share state.
            counts = {}
            for ch in s1:
                counts[ch] = counts.get(ch, 0) + 1

            # Decrease the count for each character of the second string.
            for ch in s2:
                if ch not in counts:
                    return False
                counts[ch] -= 1

            # The strings are anagrams only if every count is back to zero.
            return all(count == 0 for count in counts.values())
    
  • 2020-12-13 01:30
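    import collections
    
    def find_anagrams(x):
        # Signature of each word: its letters in sorted order.
        anagrams = [''.join(sorted(i)) for i in x]
        # Keep only the signatures shared by more than one word.
        anagrams_counts = [item for item, count in collections.Counter(anagrams).items() if count > 1]
        # Return the words whose signature is shared.
        return [i for i in x if ''.join(sorted(i)) in anagrams_counts]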
    
  • 2020-12-13 01:33

    There are multiple solutions to this problem:

    1. Classic approach

      First, let's consider what defines an anagram: two words are anagrams of each other if they consist of the same set of letters and each letter appears exactly the same number of times in both words. This is essentially a histogram of the letter counts of each word, which is a perfect use case for the collections.Counter data structure (see docs). The algorithm is as follows:

      • Build a dictionary where the keys are histograms and the values are lists of words that have that histogram.
      • For each word, build its histogram and add the word to the list that corresponds to this histogram.
      • Output the list of dictionary values.

      Here is the code:

      from collections import Counter, defaultdict
      
      def anagram(words):
          anagrams = defaultdict(list)
          for word in words:
              # build a hashable, order-independent histogram (a plain tuple of
              # items would depend on the order of letters in the word)
              histogram = frozenset(Counter(word).items())
              anagrams[histogram].append(word)
          return list(anagrams.values())
      
      keywords = ("hi", "hello", "bye", "helol", "abc", "cab", 
                      "bac", "silenced", "licensed", "declines")
      
      print(anagram(keywords))
      

      Note that constructing a Counter is O(l), while sorting each word is O(l*log(l)), where l is the length of the word.
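
For comparison, here is a rough sketch of the sorting-based variant mentioned in the note above, using the sorted letters of each word as the dictionary key. The function name `group_by_sorted_letters` is hypothetical, and unlike the code above this version drops words that have no anagram partner.

```python
from collections import defaultdict

def group_by_sorted_letters(words):
    """Group words that share the same letters, keyed by their sorted letters."""
    groups = defaultdict(list)
    for word in words:
        key = "".join(sorted(word))  # O(l*log(l)) for a word of length l
        groups[key].append(word)
    # Keep only the groups that contain at least two words.
    return [group for group in groups.values() if len(group) > 1]

keywords = ("hi", "hello", "bye", "helol", "abc", "cab",
            "bac", "silenced", "licensed", "declines")

print(group_by_sorted_letters(keywords))
# [['hello', 'helol'], ['abc', 'cab', 'bac'], ['silenced', 'licensed', 'declines']]
```

For typical word lengths the difference between the two key types is small; the sorted-string key is mainly easier to read when debugging.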

    2. Solving anagrams using prime numbers

      This is a more advanced solution that relies on the "multiplicative uniqueness" of prime numbers. You can refer to this SO post: Comparing anagrams using prime numbers, and here is a sample Python implementation.
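
Independently of the linked material, the idea can be sketched as follows: map each letter to a distinct prime and multiply the primes for a word. Because prime factorizations are unique, two words get the same product exactly when they use the same letters the same number of times. All names here (`first_primes`, `LETTER_PRIME`, `prime_signature`, `group_by_prime_signature`) are made up for illustration, and the sketch assumes the words contain only the ASCII letters a-z.

```python
import string
from collections import defaultdict

def first_primes(n):
    """Return the first n prime numbers using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it evenly
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

# Assumption: only the letters a-z occur; each letter gets its own prime.
LETTER_PRIME = dict(zip(string.ascii_lowercase, first_primes(26)))

def prime_signature(word):
    """Multiply the primes of all letters; anagrams share the same product."""
    product = 1
    for ch in word.lower():
        product *= LETTER_PRIME[ch]  # raises KeyError for characters outside a-z
    return product

def group_by_prime_signature(words):
    """Group words by their prime product and keep groups of two or more."""
    groups = defaultdict(list)
    for word in words:
        groups[prime_signature(word)].append(word)
    return [group for group in groups.values() if len(group) > 1]

print(prime_signature("abc"), prime_signature("cab"))  # 30 30
print(group_by_prime_signature(["car", "tree", "boy", "girl", "arc"]))  # [['car', 'arc']]
```

Python integers have arbitrary precision, so the product cannot overflow, although it grows quickly for long words.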
