Question
I'm trying to think through the most efficient way to do this in Python.
Suppose I have a list of tuples:
[('dog',12,2), ('cat',15,1), ('dog',11,1), ('cat',15,2), ('dog',10,3), ('cat',16,3)]
And suppose I have a function which takes two of these tuples and combines them:
def my_reduce(obj1, obj2):
    return (obj1[0], max(obj1[1], obj2[1]), min(obj1[2], obj2[2]))
How do I perform an efficient reduce by 'key' where the key here could be the first value, so the final result would be something like:
[('dog',12,1), ('cat',16,1)]
Answer 1:
If you want to use your my_reduce and reduce, you can do it this way. It's fairly short, actually:
Preparation:
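from functools import reduce  # a builtin on Python 2; must be imported on Python 3
from itertools import groupby
from operator import itemgetter
pets = [('dog',12,2), ('cat',15,1), ('dog',11,1), ('cat',15,2), ('dog',10,3), ('cat',16,3)]
def my_reduce(obj1, obj2):
    # Keep the name, the larger second field and the smaller third field.
    return (obj1[0], max(obj1[1], obj2[1]), min(obj1[2], obj2[2]))
Solution:
# groupby only merges adjacent items, so the list has to be sorted first.
print([reduce(my_reduce, group)
       for _, group in groupby(sorted(pets), key=itemgetter(0))])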
Output:
[('cat', 16, 1), ('dog', 12, 1)]
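The sort-then-group idea above can be wrapped in a small reusable helper. This is only a minimal sketch, assuming Python 3; the name reduce_by_key is hypothetical and not part of the standard library. It sorts by the key alone rather than by the whole tuple, which is enough for groupby to see each group as one contiguous run.

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter


def my_reduce(obj1, obj2):
    # Combine two (name, high, low) tuples that share the same name.
    return (obj1[0], max(obj1[1], obj2[1]), min(obj1[2], obj2[2]))


def reduce_by_key(func, iterable, key):
    # Hypothetical helper: groupby only merges adjacent items,
    # so sort by the key first, then fold each group with func.
    ordered = sorted(iterable, key=key)
    return [reduce(func, group) for _, group in groupby(ordered, key=key)]


pets = [('dog', 12, 2), ('cat', 15, 1), ('dog', 11, 1),
        ('cat', 15, 2), ('dog', 10, 3), ('cat', 16, 3)]
result = reduce_by_key(my_reduce, pets, itemgetter(0))
print(result)  # [('cat', 16, 1), ('dog', 12, 1)]
```

Because of the sort, this runs in O(n log n) and returns the groups in key order rather than in the order they first appear.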
Answer 2:
Alternatively, if you have pandas installed:
import pandas as pd
l = [('dog',12,2), ('cat',15,1), ('dog',11,1), ('cat',15,2), ('dog',10,3), ('cat',16,3)]
df = pd.DataFrame(data=l, columns=['animal', 'm', 'n']).groupby('animal').agg({'m':'max', 'n':'min'})
df
Out[6]:
         m  n
animal
cat     16  1
dog     12  1
To get the original format:
list(zip(df.index, *df.values.T.tolist()))  # df is the result above; tolist() yields plain Python ints
Out[14]: [('cat', 16, 1), ('dog', 12, 1)]
Answer 3:
I don't think reduce is a good tool for this job, because you first have to use itertools or similar to group the list by the key. Otherwise you will be comparing cats and dogs and all hell will break loose!
Instead, a simple loop is fine:
>>> my_list = [('dog',12,2), ('cat',15,1), ('dog',11,1), ('cat',15,2)]
>>> output = {}
>>> for animal, high, low in my_list:
...     try:
...         prev_high, prev_low = output[animal]
...     except KeyError:
...         output[animal] = high, low
...     else:
...         output[animal] = max(prev_high, high), min(prev_low, low)
...
Then if you want the original format back:
>>> output = [(k,) + v for k, v in output.items()]
>>> output
[('dog', 12, 1), ('cat', 15, 1)]
Note that before Python 3.7 a plain dict does not guarantee insertion order, so this can destroy the ordering from the original list. If you want to preserve the order in which the keys first appear on older versions, initialise output with an OrderedDict instead.
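The loop above can also be generalised to accept any combining function, such as the my_reduce from the question. The following is a rough sketch, assuming Python 3.7+ (where dicts keep insertion order); reduce_by_key_one_pass is a hypothetical name chosen for illustration.

```python
def my_reduce(obj1, obj2):
    # Combine two (name, high, low) tuples that share the same name.
    return (obj1[0], max(obj1[1], obj2[1]), min(obj1[2], obj2[2]))


def reduce_by_key_one_pass(func, iterable, key):
    # Hypothetical helper: a single pass with no sorting.
    merged = {}
    for item in iterable:
        k = key(item)
        # The first item seen for a key is stored as-is;
        # every later item is folded into the stored value.
        merged[k] = func(merged[k], item) if k in merged else item
    return list(merged.values())


my_list = [('dog', 12, 2), ('cat', 15, 1), ('dog', 11, 1), ('cat', 15, 2)]
result = reduce_by_key_one_pass(my_reduce, my_list, lambda t: t[0])
print(result)  # [('dog', 12, 1), ('cat', 15, 1)]
```

This does one dict lookup per item, so it is O(n) on average and needs no sort; the keys come back in the order they first appear.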
Answer 4:
If you really want to use reduce, I think this works (it gives you a dict back instead of a list, but meh):
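from functools import reduce  # a builtin on Python 2; must be imported on Python 3
def my_reduce(obj1, obj2):
    if not isinstance(obj1, dict):
        # First call: obj1 is still a tuple, so restart with an empty dict as accumulator.
        return reduce(my_reduce, [{}, obj1, obj2])
    try:
        obj1[obj2[0]] = max(obj1[obj2[0]][0], obj2[1]), min(obj1[obj2[0]][1], obj2[2])
    except KeyError:
        # First time this key is seen: store its (high, low) pair.
        obj1[obj2[0]] = obj2[1:]
    return obj1
my_list = [('dog',12,2), ('cat',15,1), ('dog',11,1), ('cat',15,2), ('dog',10,3), ('cat',16,3)]
print(reduce(my_reduce, my_list))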
I think both of the other solutions are better, however.
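If reduce is a hard requirement, one possible simplification is to pass an empty dict as its third argument (the initial value), which removes the need for the isinstance check. This is a sketch under the assumption of Python 3; fold_into_dict is a hypothetical name.

```python
from functools import reduce


def fold_into_dict(acc, item):
    # acc maps animal -> (highest second field, lowest third field) seen so far.
    animal, high, low = item
    if animal in acc:
        prev_high, prev_low = acc[animal]
        acc[animal] = (max(prev_high, high), min(prev_low, low))
    else:
        acc[animal] = (high, low)
    return acc


my_list = [('dog', 12, 2), ('cat', 15, 1), ('dog', 11, 1),
           ('cat', 15, 2), ('dog', 10, 3), ('cat', 16, 3)]
# The third argument seeds the accumulator, so the first call already gets a dict.
result = reduce(fold_into_dict, my_list, {})
print(result)  # {'dog': (12, 1), 'cat': (16, 1)}
```

With the initial value supplied, an empty or single-element list also yields a dict rather than an error or a bare tuple.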
Source: https://stackoverflow.com/questions/29933189/reduce-by-key-in-python