Counting the number of unique words in a document with Python

一个人的身影 2020-12-02 00:05

I am a Python newbie trying to understand the answer given here to the question of counting unique words in a document. The answer is:

print len(set(w.lower()

6 Answers
  •  感动是毒
    2020-12-02 00:36

    You can get the number of items in a set, list, or tuple the same way, with len(my_set) or len(my_list).
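    As a small illustration of that point, here is a sketch written for Python 3. The word list is made-up sample data, not taken from the question, and the variable names are only illustrative.

```python
# Hypothetical sample data, not taken from the question.
words = ["the", "cat", "saw", "the", "dog"]

total_words = len(words)          # counts every item in the list
unique_words = len(set(words))    # a set keeps each distinct item only once
as_tuple = len(tuple(words))      # a tuple is measured the same way as a list

print(total_words, unique_words, as_tuple)
```

    Converting to a set before calling len is what turns "how many words" into "how many different words".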

    Edit: Counting the number of times each word is used is a different problem.
    Here is the obvious approach:

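    count = {}
    # Read the whole file and split it into words on whitespace.
    with open('filename.dat') as f:
        words = f.read().split()
    for w in words:
        if w in count:
            count[w] += 1
        else:
            count[w] = 1
    # The parenthesized print works in both Python 2 and Python 3.
    for word, times in count.items():
        print("%s was found %d times" % (word, times))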
    

    If you want to avoid the if-clause, you can look at collections.defaultdict.
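    A minimal sketch of that defaultdict variant, assuming the text has already been read into a string. The helper name count_words and the sample sentence are hypothetical and stand in for the contents of 'filename.dat'.

```python
from collections import defaultdict


def count_words(text):
    """Map each whitespace-separated word in text to how often it occurs."""
    count = defaultdict(int)  # a missing key starts at int(), which is 0
    for w in text.split():
        count[w] += 1         # no membership check needed
    return count


# Hypothetical sample text standing in for the file contents.
counts = count_words("to be or not to be")
for word, times in counts.items():
    print("%s was found %d times" % (word, times))
```

    One thing to keep in mind: looking up a word that was never seen, such as counts["maybe"], quietly adds that key with a value of 0, so use counts.get(...) or an "in" check when you only want to read.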
