Counting the number of keywords in a dictionary in Python

离开以前 2020-12-04 08:11

I have a dictionary of words in which each key is a word and each value is the number of times that word is repeated. I only want the distinct words, so I want to count the number of keys in the dictionary.

5 Answers
  • 2020-12-04 08:35

    The number of distinct words (i.e. count of entries in the dictionary) can be found using the len() function.

    > a = {'foo':42, 'bar':69}
    > len(a)
    2
    

    To get all the distinct words (i.e. the keys), use the .keys() method.

    > list(a.keys())
    ['foo', 'bar']
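
    Applied to the situation in the question, a minimal sketch might look like the following. The dictionary word_counts and its contents are made up for illustration; only len(), iteration over keys, and .values() are assumed.

```python
# Hypothetical word-count dictionary: key = word, value = number of repetitions.
word_counts = {"foo": 42, "bar": 69, "baz": 1}

distinct_count = len(word_counts)        # number of distinct words (keys)
distinct_words = sorted(word_counts)     # iterating a dict yields its keys
total_words = sum(word_counts.values())  # total occurrences across all words

print(distinct_count)  # 3
print(distinct_words)  # ['bar', 'baz', 'foo']
print(total_words)     # 112
```

    Note that len() counts keys only, so the repetition values do not affect the result.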
    
  • 2020-12-04 08:40
    len(yourdict.keys())
    

    or just

    len(yourdict)
    

    If you want to count the unique words in a file, you can split its contents into words and put them in a set:

    len(set(open(yourdictfile).read().split()))
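
    The one-liner leaves the file open until the garbage collector closes it. A slightly longer sketch with a context manager is shown here; the helper name count_unique_words and the sample text are hypothetical, and words are assumed to be separated by whitespace.

```python
import os
import tempfile


def count_unique_words(path):
    """Return the number of distinct whitespace-separated tokens in a text file."""
    with open(path, encoding="utf-8") as handle:
        return len(set(handle.read().split()))


# Demo with a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("foo bar foo\nbaz bar foo\n")
    sample_path = tmp.name

unique_total = count_unique_words(sample_path)
os.remove(sample_path)  # clean up the temporary file

print(unique_total)  # 3
```

    This treats "Foo" and "foo" as different words and keeps punctuation attached to tokens, which may or may not be what you want.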
    
  • 2020-12-04 08:43

    Calling len() directly on your dictionary works, and is faster than creating a keys view with d.keys() and calling len() on that, but the cost of either will be negligible in comparison to whatever else your program is doing.

    d = {x: x**2 for x in range(1000)}
    
    len(d)
    # 1000
    
    len(d.keys())
    # 1000
    
    %timeit len(d)
    # 41.9 ns ± 0.244 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
    
    %timeit len(d.keys())
    # 83.3 ns ± 0.41 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
    
  • 2020-12-04 08:44

    If the question is about counting how many times each keyword occurs, then I would recommend something like

    def countoccurrences(store, value):
        # Increment the count for value; start at 1 the first time it is seen.
        try:
            store[value] = store[value] + 1
        except KeyError:
            store[value] = 1
    

    In the main block, loop through the data and pass each value to the countoccurrences function:

    if __name__ == "__main__":
        store = {}
        words = ('a', 'a', 'b', 'c', 'c')  # renamed from `list` to avoid shadowing the built-in
        for data in words:
            countoccurrences(store, data)
        # Python 3: items() replaces iteritems(), and print is a function
        for k, v in store.items():
            print("Key " + k + " has occurred " + str(v) + " times")
    

    On Python 3.7+, where dictionaries preserve insertion order, the code outputs

    Key a has occurred 2 times
    Key b has occurred 1 times
    Key c has occurred 2 times
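
    As an alternative to the hand-written helper, the standard library's collections.Counter does the same tallying. This is a different technique from the one in this answer, sketched here with the same sample data:

```python
from collections import Counter

words = ("a", "a", "b", "c", "c")
counts = Counter(words)  # maps each distinct word to its number of occurrences

print(len(counts))  # 3 distinct words

for word, n in counts.items():
    print(f"Key {word} has occurred {n} times")
```

    Counter is a dict subclass, so len(counts) gives the number of distinct words just as it would for a plain dictionary.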
    
  • 2020-12-04 08:48

    I modified the answer posted by UnderWaterKremlin so that it runs as a plain Python 3 script, using the timeit module instead of %timeit. The result, shown below, surprised me.

    System specs:

    • Python 3.7.4
    • conda 4.8.0
    • 3.6 GHz, 8 cores, 16 GB

    import timeit
    
    d = {x: x**2 for x in range(1000)}
    #print (d)
    print (len(d))
    # 1000
    
    print (len(d.keys()))
    # 1000
    
    print (timeit.timeit('len({x: x**2 for x in range(1000)})', number=100000))        # 1
    
    print (timeit.timeit('len({x: x**2 for x in range(1000)}.keys())', number=100000)) # 2
    

    Result:

    1) = 37.0100378

    2) = 37.002148899999995

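    At first glance len(d.keys()) looks marginally faster than len(d), but the two timings differ by only about 0.008 s over 100000 runs, which is within measurement noise. Both timed statements also rebuild the 1000-entry dictionary on every run, so construction time dominates and this benchmark does not show that one call is faster than the other.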
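
    To time only the len() calls, the dictionary can be built once in the setup argument of timeit.timeit rather than inside the timed statement. A minimal sketch follows; the run count is an arbitrary choice and the numbers will vary by machine.

```python
import timeit

# Build the dictionary once in setup so only the len() call is timed.
setup = "d = {x: x**2 for x in range(1000)}"
runs = 1_000_000

t_len = timeit.timeit("len(d)", setup=setup, number=runs)
t_keys = timeit.timeit("len(d.keys())", setup=setup, number=runs)

# Convert total seconds to nanoseconds per call for easier comparison.
print(f"len(d):        {t_len / runs * 1e9:.1f} ns per call")
print(f"len(d.keys()): {t_keys / runs * 1e9:.1f} ns per call")
```

    Measured this way, the comparison reflects the cost of the len() call itself rather than the cost of constructing the dictionary.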
