Using Python, find anagrams for a list of words

If I have a list of strings, for example:

["car", "tree", "boy", "girl", "arc"...]

What should I do in order to find anagrams in t

22 Answers

天涯浪人

2020-12-13 01:17
One solution is to sort the letters of the word you are searching anagrams for (for example using sorted), sort the letters of each alternative, and compare the results.

So if you were searching for anagrams of 'rac' in the list ['car', 'girl', 'tofu', 'rca'], your code could look like this:
```
# Sorting the letters gives a canonical form shared by all anagrams.
word = sorted('rac')
alternatives = ['car', 'girl', 'tofu', 'rca']

for alt in alternatives:
    if word == sorted(alt):
        print(alt)  # prints 'car' and 'rca'
```

生来不讨喜

2020-12-13 01:17

```
def findanagranfromlistofwords(li):
    # Map each word to the later words in the list that are its anagrams.
    anagrams = {}
    for i in range(len(li)):
        sortedfirst = ''.join(sorted(str(li[i])))
        for j in range(i + 1, len(li)):
            sortednext = ''.join(sorted(str(li[j])))
            if sortedfirst == sortednext:
                # Append instead of overwriting, so several matches are all kept.
                anagrams.setdefault(li[i], []).append(li[j])

    print(anagrams)

findanagranfromlistofwords(["car", "tree", "boy", "girl", "arc"])
```

花落未央

2020-12-13 01:20

A solution in Python could look like this:

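```
class Word:
    def __init__(self, data, index):
        self.data = data    # the word (later replaced by its sorted letters)
        self.index = index  # position of the word in the original list

def printAnagrams(arr):
    # Pair each word with its original index.
    dupArray = [Word(word, i) for i, word in enumerate(arr)]

    # Sort the letters within each word so anagrams become identical strings.
    for w in dupArray:
        w.data = ''.join(sorted(w.data))

    # Sort by the letter-sorted strings so anagrams end up adjacent.
    dupArray = sorted(dupArray, key=lambda x: x.data)

    # Print the original words in that order.
    for w in dupArray:
        print(arr[w.index])

def main():
    arr = ["dog", "act", "cat", "god", "tac"]
    printAnagrams(arr)

if __name__ == '__main__':
    main()
```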

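This first creates a duplicate list of the same words, each paired with its index in the original list.
Then it sorts the letters of each string in the duplicate list.
Then it sorts the duplicate list itself by those strings, so anagrams end up next to each other.
Finally it prints the original words, using the indexes stored in the duplicate list.

For N words of at most M letters, the time complexity is O(NM log M + NM log N): sorting the letters of every word costs O(NM log M), and sorting the list takes O(N log N) comparisons of up to M characters each. This reduces to O(NM log N) when M is no larger than N.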
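
The sort-then-scan steps above can also be sketched with itertools.groupby, which collects adjacent items that share a key once the list is sorted by that key. The names letters_key and group_anagrams_sorted are hypothetical, and returning only groups of two or more words (instead of printing every word) is an assumption made for illustration.

```python
from itertools import groupby


def letters_key(word):
    # Canonical form: the word's letters in sorted order.
    return ''.join(sorted(word))


def group_anagrams_sorted(words):
    # Hypothetical helper: sort by canonical form so anagrams become adjacent,
    # then collect each run of equal keys into one group.
    ordered = sorted(words, key=letters_key)
    groups = [list(run) for _, run in groupby(ordered, key=letters_key)]
    # Keep only runs with at least two words, i.e. actual anagram groups.
    return [g for g in groups if len(g) > 1]


print(group_anagrams_sorted(["dog", "act", "cat", "god", "tac"]))
# [['act', 'cat', 'tac'], ['dog', 'god']]
```

Because sorted is stable, words inside each group keep their order from the input list.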

太阳男子

2020-12-13 01:20

A set is an appropriate data structure for the output, since you presumably don't want redundancy in the output. A dictionary is ideal for looking up if a particular sequence of letters has been previously observed, and what word it originally came from. Taking advantage of the fact that we can add the same item to a set multiple times without expanding the set lets us get away with one for loop.

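```
def return_anagrams(word_list):
    d = {}       # sorted letters -> first word seen with those letters
    out = set()  # every word that has at least one anagram in the list
    for word in word_list:
        s = ''.join(sorted(word))
        try:
            # Key already seen: the earlier word and this one are anagrams.
            out.add(d[s])
            out.add(word)
        except KeyError:
            d[s] = word
    return out
```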

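Another way of doing it, which counts letters instead of sorting and so may be faster for long words, takes advantage of the commutative property of addition: the letter counts of a word do not depend on the order of its letters. It assumes the words contain only lowercase letters a to z: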

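```
import numpy as np

def vector_anagram(l):
    d, out = dict(), set()
    for word in l:
        # Count occurrences of each letter; assumes lowercase a-z only.
        s = np.zeros(26, dtype=int)
        for c in word:
            s[ord(c) - ord('a')] += 1
        s = tuple(s)  # tuples are hashable, so the counts can be a dict key
        try:
            out.add(d[s])
            out.add(word)
        except KeyError:
            d[s] = word
    return out
```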
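
The same counting idea can be sketched without NumPy. The names count_key and count_anagrams are hypothetical, and like the version above this assumes the words contain only lowercase ASCII letters a to z.

```python
def count_key(word):
    # Hypothetical helper: 26 letter counts, one slot per letter 'a'..'z'.
    # Assumes lowercase ASCII input; other characters are not handled.
    counts = [0] * 26
    for c in word:
        counts[ord(c) - ord('a')] += 1
    return tuple(counts)  # tuples are hashable, so they can be dict keys


def count_anagrams(word_list):
    first_seen = {}  # letter counts -> first word seen with those counts
    out = set()
    for word in word_list:
        key = count_key(word)
        if key in first_seen:
            out.add(first_seen[key])
            out.add(word)
        else:
            first_seen[key] = word
    return out


print(sorted(count_anagrams(["car", "tree", "boy", "girl", "arc"])))  # ['arc', 'car']
```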

小蘑菇

2020-12-13 01:23

Sort the letters of each element, then look for duplicates. The built-in sorted function does the sorting, so you do not need to import anything.
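
A minimal sketch of that idea, using only built-ins: the sorted letters of each word act as a dictionary key, and any key that collects more than one word marks a group of anagrams. The function name find_anagram_groups is hypothetical.

```python
def find_anagram_groups(words):
    groups = {}
    for word in words:
        # Anagrams share the same sorted-letter string.
        key = ''.join(sorted(word))
        groups.setdefault(key, []).append(word)
    # A "duplicate" is any key that was seen for more than one word.
    return [group for group in groups.values() if len(group) > 1]


print(find_anagram_groups(["car", "tree", "boy", "girl", "arc"]))  # [['car', 'arc']]
```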

予麋鹿

2020-12-13 01:23

A simple solution in Python:

```
def anagram(s1, s2):

    # Remove spaces and convert to lowercase.
    s1 = s1.replace(' ', '').lower()
    s2 = s2.replace(' ', '').lower()

    # Two strings are anagrams if their sorted letters match.
    return sorted(s1) == sorted(s2)
```
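
To apply a pairwise check like this to the list from the question, one option is to test every pair with itertools.combinations. This is only a sketch: anagram_pairs is a hypothetical name, and the anagram function is repeated so the block runs on its own.

```python
from itertools import combinations


def anagram(s1, s2):
    # Ignore spaces and case, then compare sorted letters.
    s1 = s1.replace(' ', '').lower()
    s2 = s2.replace(' ', '').lower()
    return sorted(s1) == sorted(s2)


def anagram_pairs(words):
    # Hypothetical helper: check each unordered pair of words exactly once.
    return [(a, b) for a, b in combinations(words, 2) if anagram(a, b)]


print(anagram_pairs(["car", "tree", "boy", "girl", "arc"]))  # [('car', 'arc')]
```

Checking every pair needs a number of comparisons that grows quadratically with the list length, so this suits short lists; for long lists, grouping words by their sorted letters avoids the pairwise comparisons.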
