Removing duplicate keys from a Python dictionary but summing the values

I have a dictionary in Python:

d = {tags[0]: value, tags[1]: value, tags[2]: value, tags[3]: value, tags[4]: value}

Imagine that this dict is 10 times bigger: it has 50 keys and 50 values. Duplicates can be found among the tags, but even then the values are essential. How can I simply trim it to get a new dict without duplicate keys but with the values summed instead?

d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}

result

d = {'cat': 15, 'dog': 9, 'parrot': 6}

I'd like to improve Paul Seeb's answer:

tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k, v in tps:
  result[k] = result.get(k, 0) + v

tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]

from collections import defaultdict

dicto = defaultdict(int)

for k,v in tps:
    dicto[k] += v

Result:

>>> dicto
defaultdict(<type 'int'>, {'dog': 9, 'parrot': 6, 'cat': 15})
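If the tags and values start out as two parallel lists, as the `tags[0]: value` notation in the question suggests, the same defaultdict loop can consume them through zip. This is only a minimal sketch: the `values` list and the name `totals` are assumptions made for illustration, not something taken from the question.

```python
from collections import defaultdict

# Assumed inputs: two parallel lists (the name `values` is hypothetical).
tags = ['cat', 'dog', 'cat', 'parrot', 'cat']
values = [5, 9, 4, 6, 6]

totals = defaultdict(int)  # missing keys start at int() == 0
for tag, value in zip(tags, values):
    totals[tag] += value

# Convert back to a plain dict once the summing is done.
summed = dict(totals)
print(summed)
```

Converting to a plain dict at the end avoids accidentally creating new keys later just by looking them up.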

Paul Seeb

Instead of just making a dict of those things (you can't have multiple copies of the same key in a dict), I assume you can have them in a list of tuple pairs. Then it is as easy as:

tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k,v in tps:
    try:
        result[k] += v
    except KeyError:
        result[k] = v

>>> result
{'dog': 9, 'parrot': 6, 'cat': 15}

Changed mine to use more explicit try-except handling. Alfe's is very concise, though.

Perhaps what you really want is a list of key-value tuples.

[('dog',1), ('cat',2), ('cat',3)]
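To see why the list matters, here is a small sketch: a dict literal with repeated keys keeps only the last value for each key, so the earlier numbers are gone before any summing can happen, whereas a list of tuples keeps every pair. The variable names `pairs` and `cat_values` are made up for this example.

```python
# A dict literal with repeated keys keeps only the last value per key.
d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}
print(len(d), d['cat'])

# A list of (key, value) tuples preserves every pair.
pairs = [('cat', 5), ('dog', 9), ('cat', 4), ('parrot', 6), ('cat', 6)]
cat_values = [v for k, v in pairs if k == 'cat']
print(cat_values, sum(cat_values))
```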

I'm not sure what you're trying to achieve, but the Counter class might be helpful for what you're trying to do: http://docs.python.org/dev/library/collections.html#collections.Counter
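As a rough sketch of how Counter could apply here, assuming the data is available as a list of (key, value) pairs like the `tps` list used in the other answers: a Counter treats missing keys as 0, so the values can be accumulated with a plain `+=`.

```python
from collections import Counter

tps = [('cat', 5), ('dog', 9), ('cat', 4), ('parrot', 6), ('cat', 6)]

totals = Counter()
for k, v in tps:
    totals[k] += v  # a missing key counts as 0, so no KeyError

# most_common() lists the keys from the largest total to the smallest.
print(totals.most_common())
```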

This is the perfect situation for using the Counter data structure. Let's take a look at what it does with a few familiar data structures, starting with the good old list.

>>> from collections import Counter
>>> list_a = ["A", "A", "B", "C", "C", "A", "D"]
>>> list_b = ["B", "A", "B", "C", "C", "C", "D"]
>>> c1 = Counter(list_a)
>>> c2 = Counter(list_b)
>>> c1
Counter({'A': 3, 'C': 2, 'B': 1, 'D': 1})
>>> c2
Counter({'C': 3, 'B': 2, 'A': 1, 'D': 1})
>>> c1 - c2
Counter({'A': 2})
>>> c1 + c2
Counter({'C': 5, 'A': 4, 'B': 3, 'D': 2})
>>> c_diff = c1 - c2
>>> c_diff.update([77, 77, -99, 0, 0, 0])
>>> c_diff
Counter({0: 3, 'A': 2, 77: 2, -99: 1})

As you can see, this behaves like a set that keeps the count of each element's occurrences as a value. But what about using a dictionary instead of a list? A dictionary is itself a set-like structure whose values don't have to be numbers, so how does that get handled? Let's take a look.

>>> dic1 = {"A":"a", "B":"b"}
>>> cd = Counter(dic1)
>>> cd
Counter({'B': 'b', 'A': 'a'})
>>> cd.update(B='bB123')
>>> cd
Counter({'B': 'bbB123', 'A': 'a'})


>>> dic2 = {"A":[1,2], "B": ("a", 5)}
>>> cd2 = Counter(dic2)
>>> cd2
Counter({'B': ('a', 5), 'A': [1, 2]})
>>> cd2.update(A=[42], B=(2,2))
>>> cd2
Counter({'B': ('a', 5, 2, 2), 'A': [1, 2, 42]})
>>> cd2 = Counter(dic2)
>>> cd2
Counter({'B': ('a', 5), 'A': [1, 2]})
>>> cd2.update(A=[42], B=("new elem",))
>>> cd2
Counter({'B': ('a', 5, 'new elem'), 'A': [1, 2, 42]})

As you can see, the value passed to update has to be of a type that can be added to the existing value (in practice, the same type), or update raises a TypeError. As for your particular case, just go with the flow:

>>> d = {'cat': 5, 'dog': 9, 'cat': 4, 'parrot': 6, 'cat': 6}
>>> cd3 = Counter(d)
>>> cd3
Counter({'dog': 9, 'parrot': 6, 'cat': 6})
>>> cd3.update(parrot=123)
>>> cd3
Counter({'parrot': 129, 'dog': 9, 'cat': 6})
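Note that `cat` comes out as 6 rather than 15 in the session above: the dict literal has already dropped the earlier `cat` values before Counter ever sees it. One possible way around this, assuming the data can be supplied as a list of pairs (`tps`, borrowed from the other answers; `cd4` is a made-up name), relies on the fact that Counter.update adds to existing counts when given a mapping:

```python
from collections import Counter

tps = [('cat', 5), ('dog', 9), ('cat', 4), ('parrot', 6), ('cat', 6)]

cd4 = Counter()
for k, v in tps:
    # Counter.update() adds v to the current count for k;
    # unlike dict.update(), it does not overwrite it.
    cd4.update({k: v})

print(dict(cd4))
```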

This option works, but it uses a list; at the very least it may provide some insight:

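def unique_sorted_values(query):
    # Function name is illustrative; the original snippet was a bare method body.
    # Collect every value of the dict as an int, in ascending order.
    data = sorted(int(j) for j in query.values())
    # Keep only the first occurrence of each value.
    data_array = []
    for x in data:
        if x not in data_array:
            data_array.append(x)
    return data_array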
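The same de-duplication can probably be written more compactly with a set, which drops repeated values in one step. A minimal sketch, assuming `query` is a dict whose values can all be converted with int(); the function name and the sample data are made up for illustration:

```python
def unique_sorted_values_via_set(query):
    # set() removes duplicate values; sorted() returns an ascending list.
    return sorted(set(int(v) for v in query.values()))


# Hypothetical sample input mixing strings and ints.
query = {'a': '3', 'b': '1', 'c': '3', 'd': 2}
unique_values = unique_sorted_values_via_set(query)
print(unique_values)
```

Like the loop version, this only removes duplicate values; it does not sum anything per key.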

If I understand your question correctly and you want to get rid of the data for duplicate keys, use the dictionary's update method while building the dictionary. It overwrites the existing value if the key is a duplicate.

tps = [('cat',5),('dog',9),('cat',4),('parrot',6),('cat',6)]
result = {}
for k, v in tps:
    result.update({k:v})
for k in result:
    print("%s: %s" % (k, result[k]))

Output will look like: dog: 9 parrot: 6 cat: 6
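For comparison, the update loop appears to be equivalent to passing the list straight to dict(), which applies the same last-value-wins rule. A short sketch (the name `overwritten` is invented here) that makes the overwriting visible:

```python
tps = [('cat', 5), ('dog', 9), ('cat', 4), ('parrot', 6), ('cat', 6)]

overwritten = {}
for k, v in tps:
    overwritten.update({k: v})  # a later value replaces an earlier one

# dict() built from the same pairs keeps the last value per key too.
assert overwritten == dict(tps)
print(overwritten['cat'])
```

Because earlier values are replaced rather than added, this approach removes duplicates but does not produce the summed totals asked for in the question.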

Source: https://stackoverflow.com/questions/10654499/removing-duplicate-keys-from-python-dictionary-but-summing-the-values

Tags

python

duplicates