What is the most Pythonic way to group a list of dictionaries by multiple keys and sum or average the values, please? Say I have a list of dictionaries as below:
You can accumulate the total quantity and the number of occurrences for each (dept, sku) pair in a single defaultdict:
from collections import defaultdict
# Each key maps to [total qty, number of rows seen].
counts = defaultdict(lambda: [0, 0])
for line in input_data:
    entry = counts[(line['dept'], line['sku'])]
    entry[0] += line['qty']
    entry[1] += 1
All that remains is to turn these numbers into lists of dicts:
sums_dict = [{'dept': k[0], 'sku': k[1], 'qty': v[0]}
             for k, v in counts.items()]
avg_dict = [{'dept': k[0], 'sku': k[1], 'avg': float(v[0]) / v[1]}
            for k, v in counts.items()]
The results for the sums:
sums_dict
[{'dept': '002', 'qty': 600, 'sku': 'qux'},
 {'dept': '001', 'qty': 400, 'sku': 'foo'},
 {'dept': '003', 'qty': 700, 'sku': 'foo'},
 {'dept': '002', 'qty': 900, 'sku': 'baz'},
 {'dept': '001', 'qty': 200, 'sku': 'bar'}]
and for the averages:
avg_dict
[{'avg': 600.0, 'dept': '002', 'sku': 'qux'},
 {'avg': 200.0, 'dept': '001', 'sku': 'foo'},
 {'avg': 700.0, 'dept': '003', 'sku': 'foo'},
 {'avg': 450.0, 'dept': '002', 'sku': 'baz'},
 {'avg': 200.0, 'dept': '001', 'sku': 'bar'}]
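To try the steps above end to end, here is a minimal sketch. The `input_data` rows are a made-up sample (the list from the question is not reproduced here), chosen only so that the totals match the results shown above, and the function name `summarize` is hypothetical. It assumes Python 3, where `/` is true division, so the `float()` call is not needed.

```python
from collections import defaultdict

# Hypothetical sample input: the real list from the question is not shown,
# so these rows are an assumption chosen to match the totals above.
input_data = [
    {'dept': '001', 'sku': 'foo', 'qty': 100},
    {'dept': '001', 'sku': 'bar', 'qty': 200},
    {'dept': '001', 'sku': 'foo', 'qty': 300},
    {'dept': '002', 'sku': 'baz', 'qty': 400},
    {'dept': '002', 'sku': 'baz', 'qty': 500},
    {'dept': '002', 'sku': 'qux', 'qty': 600},
    {'dept': '003', 'sku': 'foo', 'qty': 700},
]


def summarize(rows):
    """Return (sums, averages), each a list of dicts grouped by (dept, sku)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [total qty, row count]
    for row in rows:
        entry = counts[(row['dept'], row['sku'])]
        entry[0] += row['qty']
        entry[1] += 1
    sums = [{'dept': dept, 'sku': sku, 'qty': total}
            for (dept, sku), (total, n) in counts.items()]
    # True division in Python 3; n is always >= 1 for an existing key.
    avgs = [{'dept': dept, 'sku': sku, 'avg': total / n}
            for (dept, sku), (total, n) in counts.items()]
    return sums, avgs


sums_dict, avg_dict = summarize(input_data)
```

On Python 3.7 and later, dicts keep insertion order, so the groups come out in first-seen order and may be ordered differently from the listings above.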
An alternative version without defaultdict, using dict.setdefault instead:
counts = {}
for line in input_data:
    entry = counts.setdefault((line['dept'], line['sku']), [0, 0])
    entry[0] += line['qty']
    entry[1] += 1
The rest is the same:
sums_dict = [{'dept': k[0], 'sku': k[1], 'qty': v[0]}
             for k, v in counts.items()]
avg_dict = [{'dept': k[0], 'sku': k[1], 'avg': float(v[0]) / v[1]}
            for k, v in counts.items()]
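If you would rather not manage the accumulator yourself, another option (a different technique from the two versions above, shown only as a sketch) is to sort by the grouping keys and use `itertools.groupby` together with `statistics.mean` from the standard library. The sample `rows` and the function name `group_stats` are assumptions for illustration, and the sketch expects at least two grouping keys.

```python
from itertools import groupby
from operator import itemgetter
from statistics import mean


def group_stats(rows, keys=('dept', 'sku'), value='qty'):
    """Group rows by `keys`; return one dict per group with sum and average.

    Assumes `keys` has at least two names, so itemgetter returns a tuple.
    """
    key_func = itemgetter(*keys)
    result = []
    # groupby only merges adjacent items, so the rows must be sorted first.
    for group_key, group in groupby(sorted(rows, key=key_func), key=key_func):
        values = [row[value] for row in group]
        summary = dict(zip(keys, group_key))
        summary[value] = sum(values)
        summary['avg'] = mean(values)
        result.append(summary)
    return result


# Hypothetical sample rows, not the data from the question.
rows = [
    {'dept': '001', 'sku': 'foo', 'qty': 100},
    {'dept': '002', 'sku': 'baz', 'qty': 400},
    {'dept': '001', 'sku': 'foo', 'qty': 300},
    {'dept': '002', 'sku': 'baz', 'qty': 500},
]
stats = group_stats(rows)
```

The trade-off is that sorting costs O(n log n), whereas the dict-based versions make a single pass over the data; in return the output comes back ordered by the grouping keys.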