Python dict: get vs setdefault

误落风尘 2020-12-04 15:04

The following two expressions seem equivalent to me. Which one is preferable?

data = [('a', 1), ('b', 1), ('b', 2)]

d1 = {}
d2 = {}

for key, val in data:
    # variant 1
    d1[key] = d1.get(key, []) + [val]
    # variant 2
    d2.setdefault(key, []).append(val)

8 Answers
  • 2020-12-04 15:53

    You might want to look at defaultdict in the collections module. The following is equivalent to your examples.

    from collections import defaultdict
    
    data = [('a', 1), ('b', 1), ('b', 2)]
    
    d = defaultdict(list)
    
    for k, v in data:
        d[k].append(v)
    

    There's more here.

  • 2020-12-04 15:53

    The logic of dict.get is:

    if key in a_dict:
        value = a_dict[key] 
    else: 
        value = default_value
    

    Take an example:

    In [72]: a_dict = {'mapping':['dict', 'OrderedDict'], 'array':['list', 'tuple']}
    In [73]: a_dict.get('string', ['str', 'bytes'])
    Out[73]: ['str', 'bytes']
    In [74]: a_dict.get('array', ['str', 'bytes'])
    Out[74]: ['list', 'tuple']
    

    The mechanism of setdefault is easiest to see by first writing the logic out by hand:

    levels = ['master', 'manager', 'salesman', 'accountant', 'assistant']
    # group them by the leading letter
    group_by_leading_letter = {}
    # the logic spelled out with an explicit if/else
    for level in levels:
        leading_letter = level[0]
        if leading_letter not in group_by_leading_letter:
            group_by_leading_letter[leading_letter] = [level]
        else:
            group_by_leading_letter[leading_letter].append(level)

    In [80]: group_by_leading_letter
    Out[80]: {'a': ['accountant', 'assistant'], 'm': ['master', 'manager'], 's': ['salesman']}
    

    The setdefault dict method is for precisely this purpose. The preceding for loop can be rewritten as:

    In [87]: group_by_leading_letter = {}
        ...: for level in levels:
        ...:     leading = level[0]
        ...:     group_by_leading_letter.setdefault(leading, []).append(level)

    In [88]: group_by_leading_letter
    Out[88]: {'a': ['accountant', 'assistant'], 'm': ['master', 'manager'], 's': ['salesman']}
    

    Put simply, the element is appended to the list already stored under the key, or to a new empty list that setdefault inserts first.
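
    The reason this one-liner works is that setdefault returns the object stored in the dict, not a copy of it. A minimal sketch of that behaviour (the names groups, first and second are only illustrative):

```python
# setdefault returns the list stored in the dict, so mutating the
# return value mutates the dict's value.
groups = {}

first = groups.setdefault("m", [])   # key absent: inserts [] and returns it
first.append("master")

second = groups.setdefault("m", [])  # key present: the fresh [] is discarded
second.append("manager")

same_object = first is second

print(groups)       # {'m': ['master', 'manager']}
print(same_object)  # True
```

    Both calls hand back the same list object, which is why chaining .append() onto setdefault updates the dictionary in place.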

    A defaultdict makes this even easier. To create one, you pass a type or function that generates the default value for each slot in the dict:

    from collections import defaultdict
    group_by_leading_letter = defaultdict(list)
    for level in levels:
        group_by_leading_letter[level[0]].append(level)
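
    As a further sketch of the "type or function" point, the factory can be any zero-argument callable. The two dictionaries below (count_by_letter and longest_by_letter) are made-up illustrations that reuse the levels list from above:

```python
from collections import defaultdict

levels = ['master', 'manager', 'salesman', 'accountant', 'assistant']

# A type as the factory: int() returns 0, which is handy for counting.
count_by_letter = defaultdict(int)
for level in levels:
    count_by_letter[level[0]] += 1

# A function as the factory: here a lambda that returns an empty string.
longest_by_letter = defaultdict(lambda: "")
for level in levels:
    if len(level) > len(longest_by_letter[level[0]]):
        longest_by_letter[level[0]] = level

print(dict(count_by_letter))    # {'m': 2, 's': 1, 'a': 2}
print(dict(longest_by_letter))  # {'m': 'manager', 's': 'salesman', 'a': 'accountant'}
```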
    
  • 2020-12-04 15:55

    The accepted answer from agf isn't comparing like with like. After:

    from timeit import timeit
    print(timeit("d[0] = d.get(0, []) + [1]", "d = {1: []}", number=10000))


    d[0] contains a list with 10,000 items, whereas after:

    print(timeit("d.setdefault(0, []) + [1]", "d = {1: []}", number=10000))


    d[0] is simply []; that is, the d.setdefault version never modifies the list stored in d. The code should actually be:

    print(timeit("d.setdefault(0, []).append(1)", "d = {1: []}", number=10000))
    

    and in fact it is faster than the faulty setdefault example.

    The real difference is that when you append using concatenation, the whole list is copied every time (and once you have 10,000 elements, that becomes measurable). With append, the list updates are amortised O(1), i.e. effectively constant time.
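
    A rough way to see the copying cost for yourself, as a sketch only: the helper names grow_by_concat and grow_by_append are invented here, and the timings will vary by machine, so no expected numbers are given.

```python
from timeit import timeit


def grow_by_concat(n):
    """Build the list with get() plus concatenation (copies the list each time)."""
    d = {}
    for _ in range(n):
        d[0] = d.get(0, []) + [1]
    return d


def grow_by_append(n):
    """Build the list with setdefault() plus append (mutates in place)."""
    d = {}
    for _ in range(n):
        d.setdefault(0, []).append(1)
    return d


n = 10_000
concat_seconds = timeit(lambda: grow_by_concat(n), number=1)
append_seconds = timeit(lambda: grow_by_append(n), number=1)
print(f"concat: {concat_seconds:.4f}s  append: {append_seconds:.4f}s")
```

    Both helpers end up with the same dictionary; only the amount of work done per iteration differs.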

    Finally, there are two other options not considered in the original question: defaultdict or simply testing the dictionary to see whether it already contains the key.

    So, assuming d3, d4 = defaultdict(list), {}

    # variant 1 (0.39)
    d1[key] = d1.get(key, []) + [val]
    # variant 2 (0.003)
    d2.setdefault(key, []).append(val)
    # variant 3 (0.0017)
    d3[key].append(val)
    # variant 4 (0.002)
    if key in d4:
        d4[key].append(val)
    else:
        d4[key] = [val]
    

    variant 1 is by far the slowest because it copies the list every time, variant 2 is the second slowest, variant 3 is the fastest but won't work if you need Python older than 2.5, and variant 4 is just slightly slower than variant 3.

    I would say use variant 3 if you can, with variant 4 as an option for those occasional places where defaultdict isn't an exact fit. Avoid both of your original variants.
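
    One case where defaultdict may not be an exact fit, offered as an illustration rather than something spelled out above: a plain subscript lookup on a missing key inserts that key, which can be surprising if you only meant to read. A small sketch, reusing the d3 and d4 names:

```python
from collections import defaultdict

d3 = defaultdict(list)
d4 = {}

_ = d3["missing"]            # subscript lookup calls list() and stores the result
has_key_d3 = "missing" in d3

_ = d4.get("missing", [])    # get() on a plain dict never inserts
has_key_d4 = "missing" in d4

print(has_key_d3, has_key_d4)  # True False
```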

  • 2020-12-04 15:55

    For those who are still struggling to understand these two methods, here is the basic difference between get() and setdefault():

    Scenario-1

    root = {}
    root.setdefault('A', [])
    print(root)
    

    Scenario-2

    root = {}
    root.get('A', [])
    print(root)
    

    In Scenario-1 the output will be {'A': []}, while in Scenario-2 it will be {}.

    So setdefault() inserts absent keys into the dict, while get() only returns the default value and does not modify the dictionary.

    Now for where this is useful: suppose you are looking up a key in a dict whose values are lists, and you want to extend that list if the key is found, or otherwise create the key with that list.

    using setdefault()

    def fn1(dic, key, lst):
        dic.setdefault(key, []).extend(lst)
    

    using get()

    def fn2(dic, key, lst):
        dic[key] = dic.get(key, []) + lst  # explicit assignment happens here
    

    Now let's examine the timings:

    dic = {}
    %%timeit -n 10000 -r 4
    fn1(dic, 'A', [1,2,3])
    

    Took 288 ns

    dic = {}
    %%timeit -n 10000 -r 4
    fn2(dic, 'A', [1,2,3])
    

    Took 128 s

    So there is a very large timing difference between these two approaches.

  • 2020-12-04 16:01
    In [1]: person_dict = {}
    
    In [2]: person_dict['liqi'] = 'LiQi'
    
    In [3]: person_dict.setdefault('liqi', 'Liqi')
    Out[3]: 'LiQi'
    
    In [4]: person_dict.setdefault('Kim', 'kim')
    Out[4]: 'kim'
    
    In [5]: person_dict
    Out[5]: {'Kim': 'kim', 'liqi': 'LiQi'}
    
    In [6]: person_dict.get('Dim', '')
    Out[6]: ''
    
    In [7]: person_dict
    Out[7]: {'Kim': 'kim', 'liqi': 'LiQi'}
    
  • 2020-12-04 16:02

    1. Explained with a good example here:
    http://code.activestate.com/recipes/66516-add-an-entry-to-a-dictionary-unless-the-entry-is-a/

    dict.setdefault typical usage
    somedict.setdefault(somekey,[]).append(somevalue)

    dict.get typical usage
    theIndex[word] = 1 + theIndex.get(word,0)


    2. More explanation : http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html

    dict.setdefault() is equivalent to "get, or set and get"; put another way, "set if necessary, then get". It's especially efficient if your dictionary key is expensive to compute or long to type.

    The only problem with dict.setdefault() is that the default value is always evaluated, whether needed or not. That only matters if the default value is expensive to compute. In that case, use defaultdict.


    3. Finally, the official docs, with the difference highlighted: http://docs.python.org/2/library/stdtypes.html

    get(key[, default])
    Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.

    setdefault(key[, default])
    If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
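
    The point in item 2 about the default always being evaluated can be checked with a small sketch. The counter dict calls and the helper make_default are hypothetical names used only to record how often the default is built:

```python
from collections import defaultdict

calls = {"setdefault": 0, "defaultdict": 0}


def make_default(label):
    """Stand-in for an expensive default; records each time it runs."""
    calls[label] += 1
    return []


plain = {}
lazy = defaultdict(lambda: make_default("defaultdict"))

for value in range(5):
    # The argument is evaluated before setdefault runs, on every iteration.
    plain.setdefault("k", make_default("setdefault")).append(value)
    # The factory runs only when "k" is missing, i.e. once.
    lazy["k"].append(value)

print(calls)  # {'setdefault': 5, 'defaultdict': 1}
```

    Both dictionaries end up holding the same list; the difference is only in how many throwaway defaults were constructed along the way.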
