Question
I have the following list of tuples: [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
I would like to know if I can use Python's reduce function to aggregate them and produce the following output: [('a', 3), ('b', 1), ('c', 2)]
If there are other ways, I would like to know them as well (a loop is fine).
Answer 1:
It seems difficult to achieve using reduce, because if both tuples that you "reduce" don't bear the same letter, you cannot compute the result. How to reduce ('a',1) and ('b',1) to some viable result?
The best I could do was l = functools.reduce(lambda x,y : (x[0],x[1]+y[1]) if x[0]==y[0] else x+y,sorted(l))
It gave me ('a', 3, 'b', 1, 'c', 1, 'c', 1). So it kind of worked for the first letter, but it would need more than one pass to handle the others (rebuilding the tuples and running another similar reduce, which is not very efficient, to say the least).
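The attempt above can be reproduced as a small standalone script. This is only a sketch to show why the result ends up as one flat tuple; the names `pairs`, `merge`, and `flat` are illustrative and not part of the original answer.

```python
import functools

pairs = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def merge(x, y):
    # Only the first two slots of the accumulator x are compared with y,
    # so once x has grown into a flat tuple, later letters never match.
    if x[0] == y[0]:
        return (x[0], x[1] + y[1])
    return x + y  # tuple concatenation, not aggregation

flat = functools.reduce(merge, sorted(pairs))
print(flat)  # ('a', 3, 'b', 1, 'c', 1, 'c', 1)
```

Because the accumulator always starts with 'a', the two ('c', 1) entries are appended rather than summed.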
Anyway, here are two working ways of doing it.
First, using collections.Counter to count elements of the same kind:
import collections
l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]
c = collections.Counter()
for a, i in l:
    # add the weight i to the running total for letter a
    c[a] += i
We cannot simply build the Counter from a list comprehension of the letters, because each element carries a weight (even though here it is always 1).
Result: a dictionary: Counter({'a': 3, 'c': 2, 'b': 1})
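If a list of tuples is wanted rather than a Counter, the totals can be read back out of it. A minimal sketch, assuming that sorting by letter is the desired order (the variable names are illustrative):

```python
import collections

pairs = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

counts = collections.Counter()
for letter, weight in pairs:
    counts[letter] += weight

# items() yields (letter, total) pairs; sorting orders them by letter
as_list = sorted(counts.items())
print(as_list)  # [('a', 3), ('b', 1), ('c', 2)]

# most_common() orders by total instead, largest first
by_count = counts.most_common()
print(by_count)  # [('a', 3), ('c', 2), ('b', 1)]
```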
Second option: use itertools.groupby on the sorted list, grouping by name/letter, and performing the sum on the integers bearing the same letter:
import itertools
print([(k, sum(e for _, e in v)) for k, v in itertools.groupby(sorted(l), key=lambda x: x[0])])
Result:
[('a', 3), ('b', 1), ('c', 2)]
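The sort matters here because itertools.groupby only groups consecutive items with equal keys. The sketch below compares the unsorted and sorted inputs; the helper name `totals` is hypothetical, and operator.itemgetter(0) stands in for the lambda used above.

```python
import itertools
import operator

pairs = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def totals(seq):
    # Sum the integers within each run of consecutive equal letters
    return [(k, sum(n for _, n in grp))
            for k, grp in itertools.groupby(seq, key=operator.itemgetter(0))]

unsorted_result = totals(pairs)
sorted_result = totals(sorted(pairs))

print(unsorted_result)  # [('a', 2), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]
print(sorted_result)    # [('a', 3), ('b', 1), ('c', 2)]
```

Without sorting, 'a' and 'c' each appear twice because their occurrences are not adjacent in the input.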
Answer 2:
An alternative approach using the collections.defaultdict class and the sum function:
import collections
l = [('a', 1), ('a', 1), ('b', 1), ('c',1), ('a', 1), ('c', 1)]
d = collections.defaultdict(list)
for t in l:
    # collect every count seen for the letter t[0]
    d[t[0]].append(t[1])
result = [(k,sum(v)) for k,v in d.items()]
print(result)
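The output (the order of the tuples follows the dictionary's iteration order; on Python 3.7+ that is insertion order, so the letters come out as a, b, c):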
[('b', 1), ('a', 3), ('c', 2)]
Answer 3:
Another way is to create your own custom reduce function.
For example:
l = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]
def myreduce(func, seq):
    output_dict = {}
    for k, v in seq:
        # combine the running value for key k (0 if not seen yet) with v
        output_dict[k] = func(output_dict.get(k, 0), v)
    return output_dict
print(myreduce(lambda total, value: total + value, l))
Output:
{'a': 3, 'b': 1, 'c': 2}
Later on, you can convert the generated output into a list of tuples.
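The same per-key accumulation can also be written with the built-in functools.reduce by passing an empty dictionary as the initial value, followed by the conversion to a list of tuples. This is one possible sketch, not the original author's code; the helper name `add_pair` is hypothetical.

```python
import functools

pairs = [('a', 1), ('a', 1), ('b', 1), ('c', 1), ('a', 1), ('c', 1)]

def add_pair(acc, pair):
    # acc is the dictionary of running totals; pair is one (letter, count) tuple
    key, value = pair
    acc[key] = acc.get(key, 0) + value
    return acc

# The third argument ({}) is the initial accumulator
totals = functools.reduce(add_pair, pairs, {})

# Convert the dictionary into a list of tuples, sorted by letter
result = sorted(totals.items())
print(result)  # [('a', 3), ('b', 1), ('c', 2)]
```

Since the accumulator is a dictionary rather than a tuple, letters that differ no longer have to be merged into one value, which sidesteps the problem with reducing ('a', 1) and ('b', 1) directly.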
Source: https://stackoverflow.com/questions/43172488/how-can-i-create-word-count-output-in-python-just-by-using-reduce-function