Simplest way to find the element that occurs the most in each column

走了就别回头了 2021-01-24 17:51

Suppose I have

data =
[[a, a, c],
 [b, c, c],
 [c, b, b],
 [b, a, c]]

I want to get a list containing the element that occurs the most in each column.

4 Answers
  •  刺人心 (original poster)
    2021-01-24 18:40

    Is the data hashable? If so, a collections.Counter will be helpful:

    [Counter(col).most_common(1)[0][0] for col in zip(*data)]
    

    This works because zip(*data) transposes the input, yielding one column at a time. Counter then tallies the elements of each column in a dictionary-like object that maps each element to its count. Its most_common(n) method returns a list of the n (element, count) pairs with the highest counts, ordered from most common to least common. most_common(1) is therefore a one-item list, and the [0][0] takes the element out of its first (element, count) pair.

    e.g.

    >>> a,b,c = 'abc'
    >>> from collections import Counter
    >>> data = [[a, a, c],
    ...  [b, c, c],
    ...  [c, b, b],
    ...  [b, a, c]]
    >>> [Counter(col).most_common(1)[0][0] for col in zip(*data)]
    ['b', 'a', 'c']
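
To make each stage visible, here is a rough step-by-step sketch of the same one-liner. The intermediate names (columns, counts, top, modes) are only illustrative and are not part of the original answer; the elements are written as plain strings rather than unpacked from 'abc'.

```python
from collections import Counter

data = [["a", "a", "c"],
        ["b", "c", "c"],
        ["c", "b", "b"],
        ["b", "a", "c"]]

# Step 1: zip(*data) transposes the rows into columns (each column is a tuple).
columns = list(zip(*data))
print(columns[0])   # first column: ('a', 'b', 'c', 'b')

# Step 2: build one Counter per column, mapping element -> count.
counts = [Counter(col) for col in columns]
print(counts[0]["b"])   # 'b' appears twice in the first column

# Step 3: most_common(1) returns a one-item list of (element, count) pairs.
top = [counter.most_common(1) for counter in counts]
print(top[0])   # [('b', 2)]

# Step 4: [0][0] picks the element out of that single pair.
modes = [pairs[0][0] for pairs in top]
print(modes)    # ['b', 'a', 'c']
```

Note that zip stops at the shortest row, so this sketch assumes every row has the same length.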
    
