Adding counters deletes keys

Submitted by 江枫思渺然 on 2019-11-26 18:32:21

Question


See below: why does the implementation of += blow away a key in my original counter?

>>> from collections import Counter
>>> c = Counter({'a': 0, 'b': 0, 'c': 0})
>>> c.items()
[('a', 0), ('c', 0), ('b', 0)]
>>> c += Counter('abba')
>>> c.items()
[('a', 2), ('b', 2)]

I think that's impolite, to say the least. There is quite a difference between "X was counted 0 times" and "we aren't even counting Xs". It seems like collections.Counter is not a counter at all; it's more like a multiset.

But counters are a subclass of dict and we're allowed to construct them with zero or negative values: Counter(a=0, b=-1). If it's actually a "bag of things", wouldn't this be prohibited, restricting init to accept an iterable of hashable items?

To further confuse matters, Counter implements update and subtract methods that behave differently from the + and - operators. It seems like this class is having an identity crisis!

Is a Counter a dict or a bag?


Answer 1:


From the source:

def __add__(self, other):
    '''Add counts from two counters.

    >>> Counter('abbb') + Counter('bcc')
    Counter({'b': 4, 'c': 2, 'a': 1})

    '''
    if not isinstance(other, Counter):
        return NotImplemented
    result = Counter()
    for elem, count in self.items():
        newcount = count + other[elem]
        if newcount > 0:
            result[elem] = newcount
    for elem, count in other.items():
        if elem not in self and count > 0:
            result[elem] = count
    return result

It seems that Counter's __add__ is implemented to drop any key whose summed count is not positive. A missing key defaults to zero, and your original counter also holds zero for that key, so the sum is zero and the resulting Counter doesn't contain the key.

You may be able to get the behaviour you want with update:

a.update(b)

This seems to do what you want. It is probably slower, though; a hand-written variant of the __add__ method would likely be much faster.
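To make the difference concrete, here is a small sketch comparing the two approaches on the counter from the question. The variable names added and updated are only for illustration.

```python
from collections import Counter

# In-place addition goes through the multiset logic and drops counts <= 0.
added = Counter({'a': 0, 'b': 0, 'c': 0})
added += Counter('abba')

# update() simply adds the counts and keeps every key, including zeros.
updated = Counter({'a': 0, 'b': 0, 'c': 0})
updated.update(Counter('abba'))

print(sorted(added.items()))    # [('a', 2), ('b', 2)]
print(sorted(updated.items()))  # [('a', 2), ('b', 2), ('c', 0)]
```

Note that update() mutates the counter in place and returns None, so it is not a drop-in replacement inside an expression.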




Answer 2:


Counters are a kind of multiset. From the Counter() documentation:

Several mathematical operations are provided for combining Counter objects to produce multisets (counters that have counts greater than zero). Addition and subtraction combine counters by adding or subtracting the counts of corresponding elements. Intersection and union return the minimum and maximum of corresponding counts. Each operation can accept inputs with signed counts, but the output will exclude results with counts of zero or less.

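The parts that matter here are "counters that have counts greater than zero" and "the output will exclude results with counts of zero or less".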

Further on, it gives you some more detail about the multiset nature of Counters:

Note: Counters were primarily designed to work with positive integers to represent running counts; however, care was taken to not unnecessarily preclude use cases needing other types or negative values. To help with those use cases, this section documents the minimum range and type restrictions.

[...]

  • The multiset methods are designed only for use cases with positive values. The inputs may be negative or zero, but only outputs with positive values are created. There are no type restrictions, but the value type needs to support addition, subtraction, and comparison.

So Counter objects are both: dictionaries and bags. Standard dictionaries, however, don't support addition, but Counters do, so it's not as if Counters are breaking a precedent set by dictionaries here.
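As a quick illustration of the quoted rule, the sketch below runs the four multiset operators on inputs that contain zero and negative counts. The counters x and y are made-up examples, not from the documentation.

```python
from collections import Counter

x = Counter(a=3, b=0, c=-2)
y = Counter(a=1, b=0, c=1)

# Inputs may hold zero or negative counts, but every operator
# keeps only the keys whose resulting count is positive.
print(x + y)  # Counter({'a': 4})          b sums to 0, c to -1
print(x - y)  # Counter({'a': 2})          b gives 0, c gives -3
print(x & y)  # Counter({'a': 1})          min per key: 1, 0, -2
print(x | y)  # Counter({'a': 3, 'c': 1})  max per key: 3, 0, 1
```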

If you want to retain the zeros, use Counter.update() and pass in the result of Counter.elements() of the other object:

c.update(Counter('abba').elements())

Demo:

>>> c = Counter({'a': 0, 'b': 0, 'c': 0})
>>> c.update(Counter('abba').elements())
>>> c
Counter({'a': 2, 'b': 2, 'c': 0})
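If you would rather not mutate the original counter, one option is to copy it first and then update the copy. The sketch below assumes a helper named add_keep_zeros, which is hypothetical and not part of collections. It passes the other Counter to update() directly, which also works because update() accepts a mapping of counts. Unlike the elements() route, which skips counts below one, this also adds zero or negative counts from the other counter.

```python
from collections import Counter

def add_keep_zeros(left, right):
    """Hypothetical helper: sum two Counters without dropping keys
    whose total is zero or negative."""
    result = left.copy()   # Counter.copy() returns a new Counter
    result.update(right)   # update() adds counts and keeps every key
    return result

c = Counter({'a': 0, 'b': 0, 'c': 0})
total = add_keep_zeros(c, Counter('abba'))

print(sorted(total.items()))  # [('a', 2), ('b', 2), ('c', 0)]
print(sorted(c.items()))      # [('a', 0), ('b', 0), ('c', 0)]  (original unchanged)
```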


Source: https://stackoverflow.com/questions/21887125/adding-counters-deletes-keys
