nested dictionaries or tuples for key?

自闭症患者 2020-12-15 04:49

Suppose there is a structure like this:

{'key1' : { 'key2' : { .... { 'keyn' : 'value' } ... } } }

Using Python, I'm trying to det

4 answers
  •  一向 (original poster)
    2020-12-15 05:00

    Performance Testing

    I ran performance tests for insertion, retrieval, and looping on a nested dictionary and on a flat dictionary keyed by a tuple. The nesting is one level deep, and each structure holds 2,000,000 values (2000 x 1000). For the tuple-keyed dict I also measured insertion and retrieval with the tuples created in advance.
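    The two layouts being compared can be sketched at a much smaller scale. This is a minimal illustration, not the benchmark itself: the helper names `build_nested` and `build_keyed` and the 3 x 2 size are assumptions made for the example.

```python
from collections import defaultdict


def build_nested(rows, cols):
    """Nested layout: outer key i maps to an inner dict keyed by j."""
    d = defaultdict(dict)
    for i in range(rows):
        for j in range(cols):
            d[i][j] = 1  # defaultdict creates the inner dict on first use of i
    return d


def build_keyed(rows, cols):
    """Flat layout: a single dict keyed by the tuple (i, j)."""
    d = {}
    for i in range(rows):
        for j in range(cols):
            d[i, j] = 1  # a new tuple (i, j) is built on every assignment
    return d


nested = build_nested(3, 2)
keyed = build_keyed(3, 2)

# Retrieval: two chained lookups versus one lookup with a tuple key.
assert nested[2][1] == keyed[2, 1]

# Looping: flattening the nested layout yields the same items as the flat one.
flattened = {(i, j): val
             for i, inner in nested.items()
             for j, val in inner.items()}
assert flattened == keyed
```

    Both structures hold the same data; they differ only in how a pair of indices is turned into a lookup.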

    These are the results. I don't think firm conclusions can be drawn from the std dev.

    -

    keydictinsertion: Mean +- std dev: 615 ms +- 42 ms  
    keydictretrieval: Mean +- std dev: 475 ms +- 77 ms  
    keydictlooping: Mean +- std dev: 76.2 ms +- 7.4 ms  
    
    nesteddictinsertion: Mean +- std dev: 200 ms +- 7 ms  
    nesteddictretrieval: Mean +- std dev: 256 ms +- 32 ms  
    nesteddictlooping: Mean +- std dev: 88.5 ms +- 14.4 ms  
    
    Tests where the tuple was already created for the keydict  
    keydictinsertionprepared: Mean +- std dev: 432 ms +- 26 ms  
    keydictretrievalprepared: Mean +- std dev: 267 ms +- 14 ms
    

    -

    As you can see, the nesteddict is often faster than the dict with a single tuple key. Even when the keydict was given a ready-made tuple, skipping the tuple creation step, insertion was still much slower (432 ms versus 200 ms). The extra creation of an inner dict does not seem to cost much; defaultdict probably has a fast implementation. Retrieval was almost equal once the keydict did not have to create a tuple (267 ms versus 256 ms), and looping times were similar as well.
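    The effect of building the tuple on every lookup can also be checked without any third-party tool. The following is a rough sketch using the standard-library `timeit` and `statistics` modules rather than the perf tool used for the numbers above; the sizes `ROWS` and `COLS`, the repeat counts, and the helper `measure` are assumptions chosen to keep the run short, so the absolute timings will not match the results in this answer.

```python
import statistics
import timeit

ROWS, COLS = 200, 100  # assumed sizes, far smaller than the 2000 x 1000 benchmark

# Setup shared by both measurements: a tuple-keyed dict plus its prebuilt keys.
setup = """
d = {}
keys = [(i, j) for i in range(ROWS) for j in range(COLS)]
for key in keys:
    d[key] = 1
"""

# Retrieval that builds the tuple (i, j) on every lookup.
with_tuple_creation = """
for i in range(ROWS):
    for j in range(COLS):
        d[i, j]
"""

# Retrieval that reuses tuples created in advance.
prepared = """
for key in keys:
    d[key]
"""


def measure(stmt, repeat=5, number=10):
    """Return (mean, stdev) in seconds per loop over `repeat` runs."""
    runs = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number,
                         globals={"ROWS": ROWS, "COLS": COLS})
    per_loop = [t / number for t in runs]
    return statistics.mean(per_loop), statistics.stdev(per_loop)


for name, stmt in [("tuple built per lookup", with_tuple_creation),
                   ("prepared keys", prepared)]:
    mean, stdev = measure(stmt)
    print(f"{name}: Mean +- std dev: {mean * 1e3:.3f} ms +- {stdev * 1e3:.3f} ms")
```

    With so few repeats the std dev is noisy, so treat the output as an indication only.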

    Testing was done with the perf module (since renamed pyperf) from the command line. The scripts are below.

    >>>>>>> nesteddictinsertion
    python -m perf timeit -v -s "
    from collections import defaultdict
    " " 
    d = defaultdict(dict)
    for i in range(2000):
        for j in range(1000):
            d[i][j] = 1
    "
    >>>>>>> nesteddictlooping
    python -m perf timeit -v -s "
    from collections import defaultdict
    d = defaultdict(dict)
    for i in range(2000):
        for j in range(1000):
            d[i][j] = 1
    " "
    for i, inner_dict in d.items():
        for j, val in inner_dict.items():
            i
            j
            val
    "
    >>>>>>> nesteddictretrieval
    python -m perf timeit -v -s "
    from collections import defaultdict
    d = defaultdict(dict)
    for i in range(2000):
        for j in range(1000):
            d[i][j] = 1
    " "
    for i in range(2000):
        for j in range(1000):
            d[i][j]
    "
    >>>>>>> keydictinsertion
    python -m perf timeit -v -s "
    from collections import defaultdict
    " " 
    d = {}
    for i in range(2000):
        for j in range(1000):
            d[i, j] = 1
    "
    >>>>>>> keydictinsertionprepared
    python -m perf timeit -v -s "
    from collections import defaultdict
    keys = [(i, j) for i in range(2000) for j in range(1000)]
    " " 
    d = {}
    for key in keys:
        d[key] = 1
    "
    >>>>>>> keydictlooping
    python -m perf timeit -v -s "
    from collections import defaultdict
    d = {}
    for i in range(2000):
        for j in range(1000):
            d[i, j] = 1
    " "
    for key, val in d.items():
        key
        val
    "
    >>>>>>> keydictretrieval
    python -m perf timeit -v -s "
    from collections import defaultdict
    d = {}
    for i in range(2000):
        for j in range(1000):
            d[i, j] = 1
    " "
    for i in range(2000):
        for j in range(1000):
            d[i, j]
    "
    >>>>>>> keydictretrievalprepared
    python -m perf timeit -v -s "
    from collections import defaultdict
    d = {}
    keys = [(i, j) for i in range(2000) for j in range(1000)]
    for key in keys:
        d[key] = 1
    " "
    for key in keys:
        d[key]
    "
    
