Counting occurrences without using collections.Counter

问题

I am trying to retrieve the most frequent and less frequent elements in a list.

frequency([13,12,11,13,14,13,7,11,13,14,12,14,14])

My output is:

([7], [13, 14])

I tried it with:

import collections
s = [13,12,11,13,14,13,7,11,13,14,12,14,14]
count = collections.Counter(s)
mins = [a for a, b in count.items() if b == min(count.values())]
maxes = [a for a, b in count.items() if b == max(count.values())]
final_vals = [mins, maxes]

But I don't want to use the collections module and try a more logic oriented solution.
Can you please help me to do it without collections?

回答1:

You could use a try and except approach with a dict to emulate a Counter.

def counter(it):
    counts = {}
    for item in it:
        try:
            counts[item] += 1
        except KeyError:
            counts[item] = 1
    return counts

or alternativly you can use dict.get with a default of 0:

def counter(it):
    counts = {}
    for item in it:
        counts[item] = counts.get(item, 0) + 1
    return counts

And you should do the min() and max() outside the comprehensions to avoid repeatedly calculating that quantity (the function is now O(n) instead of O(n^2):

def minimum_and_maximum_frequency(cnts):
    min_ = min(cnts.values())
    max_ = max(cnts.values())
    min_items = [k for k, cnt in cnts.items() if cnt == min_]
    max_items = [k for k, cnt in cnts.items() if cnt == max_]
    return min_items, max_items

This would work as expected then:

>>> minimum_and_maximum_frequency(counter([13,12,11,13,14,13,7,11,13,14,12,14,14]))
([7], [13, 14])

回答2:

data = [13,12,11,13,14,13,7,11,13,14,12,14,14]
occurrences = {}
for i in data:
     if i in occurrences.keys():
             occurrences[i] += 1
     else:
             occurrences[i] = 1

max_vals = [i for i in occurrences.keys() if occurrences[i] == max(occurrences.values())]
min_vals = [i for i in occurrences.keys() if occurrences[i] == min(occurrences.values())]

回答3:

s = [13,12,11,13,14,13,7,11,13,14,12,14,14]

occurrences = dict()

for item in s:

    occurrences[item] = occurrences.setdefault(item, 0) + 1

mins = [a for a, b in occurrences.items() if b == min(occurrences.values())]
maxs = [a for a, b in occurrences.items() if b == max(occurrences.values())]

final_vals = [mins, maxs]

The defaultdict from collections is more suitable to replace Counter to calculate occurrences of items in list. But because you limit the use of collections. So the setdefault is more elegant to deal with KeyError.

回答4:

how about this:

s = [13,12,11,13,14,13,7,11,13,14,12,14,14]
freq_low = s.count(min(s, key=s.count))
freq_high = s.count(max(s, key=s.count))
mins = [a for a in set(s) if s.count(a) == freq_low]
maxes = [a for a in set(s) if s.count(a) == freq_high]
final_vals = [mins, maxes]

回答5:

This is a very simple solution, possibly not the most efficient one (?) but a simple one.

data = get_data()
freqs, numbers = {}, {}
for i in data:
    freqs[i] = freqs.get(i, 0) + 1
for n, c in freqs.items():
    numbers[c] = numbers.get(c, []) + [n]
counts = list(numbers.keys())
res = (numbers[min(counts)], numbers[max(counts)])

Let's see in detail what we have in the script above, let's start with the example data you gave,

In [1]: data = [13,12,11,13,14,13,7,11,13,14,12,14,14]

We are going to use two dictionaries,

In [2]: freqs, numbers = {}, {}

the first one is filled iterating on data, its keys are the individual numbers in data and its values the frequency of each number in data (see fotnote for freqs.get(…))

In [3]: for i in data: freqs[i] = freqs.get(i, 0) + 1

the second one is simply a reversal of the first one, the keys being the frequencies and the values being lists of numbers with a given frequency.

In [4]: for n, c in freqs.items(): numbers[c] = numbers.get(c, []) + [n]

In [5]: numbers
Out[5]: {1: [7], 2: [12, 11], 4: [13, 14]}

At this point we need a list with the keys of numbers, i.e., the occurrences

In [6]: counts = list(numbers.keys())

because we are interested in the minimum and maximum values of the occurrences

In [7]: [numbers[min(counts)], numbers[max(counts)]]
Out[7]: [[7], [13, 14]]

Footnote: the .get(key, default_value) method of a dictionary returns a default value if the key is not present in the dictionary, we used this feature with a 0 default value to sum the occurrences of individual numbers, and with [], the void list, to build a list of all the numbers with given frequency.

来源：https://stackoverflow.com/questions/45955740/counting-occurrences-without-using-collections-counter

标签

python

counting