Question
I have a list of counters:
from collections import Counter
counters = [
    Counter({"coach": 1, "says": 1, "play": 1, "basketball": 1}),
    Counter({"i": 2, "said": 1, "hate": 1, "basketball": 1}),
    Counter({"he": 1, "said": 1, "play": 1, "basketball": 1}),
]
I can combine them using a loop as shown below, but I'd like to avoid the loop.
all_ct = Counter()
for ct in counters:
    all_ct.update(ct)
Using reduce gives an error:
from functools import reduce

all_ct = Counter()
reduce(all_ct.update, counters)
# TypeError: update() takes from 1 to 2 positional arguments but 3 were given
Is there a way to combine the counters into a single counter without using a loop?
Answer 1:
You need to replace update() with a form that reduce can use. reduce calls its function with two arguments (the running result and the next element), but all_ct.update is a bound method, so the call ends up with three positional arguments in total (self plus the two from reduce), which is exactly what the TypeError is complaining about. Wrap the update in a two-argument function that returns the accumulator, and pass a fresh Counter as the initial value so the counters in the list are not mutated:

import functools

def static_update(x, y):
    x.update(y)   # add y's counts into x in place
    return x      # reduce needs the accumulator back

all_ct = functools.reduce(static_update, counters, Counter())
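As an aside, the reason this accumulates correctly is that Counter.update adds counts rather than replacing values the way dict.update does. A minimal sketch of the difference:

from collections import Counter

c = Counter({"basketball": 1})
c.update({"basketball": 2})   # Counter.update adds the counts together
print(c["basketball"])        # 3

d = {"basketball": 1}
d.update({"basketball": 2})   # dict.update overwrites the value instead
print(d["basketball"])        # 2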
Answer 2:
You can use the sum function:
all_ct = sum(counters, Counter())
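For reference, a quick sketch using the counters list from the question (assumed to be in scope). The Counter() second argument is sum's start value; it is needed because the default start of 0 cannot be added to a Counter:

from collections import Counter

all_ct = sum(counters, Counter())   # start from an empty Counter
print(all_ct["basketball"])         # 3 for the counters in the question
# sum(counters) without a start value raises TypeError, because 0 + Counter(...) is not defined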
Answer 3:
Note that Counter implements __add__ to merge counters, so you could use:
In [3]: from collections import Counter
...: counters = [
...: Counter({"coach": 1, "says": 1, "play": 1, "basketball": 1}),
...: Counter({"i": 2, "said": 1, "hate": 1, "basketball": 1}),
...: Counter({"he": 1, "said": 1, "play": 1, "basketball": 1}),
...: ]
In [4]: from operator import add
In [5]: from functools import reduce
In [6]: reduce(add, counters)
Out[6]:
Counter({'coach': 1,
'says': 1,
'play': 2,
'basketball': 3,
'i': 2,
'said': 2,
'hate': 1,
'he': 1})
Or more simply:
In [7]: final = Counter()
In [8]: for c in counters:
...:     final += c
...:
In [9]: final
Out[9]:
Counter({'coach': 1,
'says': 1,
'play': 2,
'basketball': 3,
'i': 2,
'said': 2,
'hate': 1,
'he': 1})
Note, the above is more efficient, since it only ever uses one dict. If you use reduce(add, counters), a new intermediate Counter object is created on each iteration.
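A quick way to see that difference (a small sketch, separate from the benchmark below): += on a Counter goes through __iadd__ and updates the same object, while + allocates a brand-new Counter every time:

from collections import Counter

a = Counter({"x": 1})
b = Counter({"y": 1})

before = id(a)
a += b                    # __iadd__ updates a in place and returns it
assert id(a) == before    # still the same object

c = a + b                 # __add__ builds a new Counter
assert c is not a and c is not b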
Just to illustrate what I mean: even in the best case, where the keys are always repeated, you have to do about double the work using the reduce/sum approach:
In [1]: from collections import Counter
...: counters = [
...: Counter({"coach": 1, "says": 1, "play": 1, "basketball": 1}),
...: Counter({"i": 2, "said": 1, "hate": 1, "basketball": 1}),
...: Counter({"he": 1, "said": 1, "play": 1, "basketball": 1}),
...: ]
In [2]: counters *= 5_000
In [3]: from functools import reduce
In [4]: from operator import add
In [5]: %%timeit
...: data = counters.copy()
...: result = Counter()
...: for c in data:
...:     result += c
...:
21.2 ms ± 542 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [6]: %%timeit
...: data = counters.copy()
...: reduce(add, counters)
...:
...:
50.9 ms ± 1.73 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
And I believe in the worst case (where each counter has keys disjoint from each of the rest) this will degrade to quadratic performance.
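To probe that claim, here is a rough, self-contained sketch (not part of the original answer) that builds counters with pairwise disjoint keys, so every intermediate Counter produced by add has to copy all keys seen so far; exact timings will vary by machine:

from collections import Counter
from functools import reduce
from operator import add
import timeit

disjoint_1x = [Counter({f"key{i}": 1}) for i in range(2_000)]
disjoint_2x = [Counter({f"key{i}": 1}) for i in range(4_000)]

# if the behaviour really is quadratic, doubling the number of disjoint
# counters should roughly quadruple the reduce(add, ...) time
print(timeit.timeit(lambda: reduce(add, disjoint_1x), number=1))
print(timeit.timeit(lambda: reduce(add, disjoint_2x), number=1))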
Finally, note that you can do an in-place add using reduce (not sum), which eliminates the performance issue:
In [6]: import operator
In [7]: operator.iadd?
Signature: operator.iadd(a, b, /)
Docstring: Same as a += b.
Type: builtin_function_or_method
In [8]: reduce(operator.iadd, counters, Counter())
Out[8]:
Counter({'coach': 5000,
'says': 5000,
'play': 10000,
'basketball': 15000,
'i': 10000,
'said': 10000,
'hate': 5000,
'he': 5000})
And note, now the performance is on par with the explicit loop:
In [9]: %%timeit
...: data = counters.copy()
...: reduce(operator.iadd, counters, Counter())
...:
...:
22 ms ± 224 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
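One caveat that may be worth spelling out (it is not stated above): operator.iadd mutates its left operand, so the Counter() passed as reduce's initial value is the object that ends up being updated in place. If you pass an existing counter there instead, it gets modified. A small self-contained sketch:

from collections import Counter
from functools import reduce
import operator

data = [Counter({"play": 1}), Counter({"play": 1})]
seed = Counter({"play": 10})

total = reduce(operator.iadd, data, seed)
assert total is seed    # the seed Counter itself was mutated in place
print(seed["play"])     # 12, not 10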
However, mixing functional constructs like reduce with functions that have side effects is just... ugly. It's better to stick to imperative code when the functions involved are impure.
Source: https://stackoverflow.com/questions/64250703/update-a-counter-from-a-list-of-counters-without-a-loop