Algorithm to compute mode

Posted by 落爺英雄遲暮 on 2019-12-01 04:14:57

Even though you have some good answers already, I decided to post another. I'm not sure it really adds a lot that's new, but I'm not at all sure it doesn't either. If nothing else, I'm pretty sure it uses more standard headers than any of the other answers. :-)

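#include <vector>
#include <algorithm>
#include <unordered_map>
#include <map>
#include <iostream>
#include <iterator>
#include <utility>
#include <functional>
#include <numeric>
#include <cstddef>

int main() {
    std::vector<int> inputs{ 1, 1, 1, 1, 2, 2, 2 };

    // Count how often each value occurs.
    std::unordered_map<int, std::size_t> counts;
    for (int i : inputs)
        ++counts[i];

    // Invert the map: key = count, value = number, largest count first.
    std::multimap<std::size_t, int, std::greater<std::size_t> > inv;
    for (auto const &p : counts)
        inv.insert(std::make_pair(p.second, p.first));

    // e is one past the last entry that shares the highest count.
    auto e = inv.upper_bound(inv.begin()->first);

    // Sum the values that share the highest count...
    double sum = std::accumulate(inv.begin(),
        e,
        0.0,
        [](double a, std::pair<std::size_t, int> const &b) {return a + b.second; });

    // ...and print their average.
    std::cout << sum / std::distance(inv.begin(), e) << '\n';
}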

Compared to @Dietmar's answer, this should be faster if you have a lot of repetition in the numbers, but his will probably be faster if the numbers are mostly unique.

Based on the comment, it seems you need to find the values that occur most often and, if several values occur the same number of times, produce the average of these. This can easily be done with std::sort() followed by a traversal that finds where the values change while keeping a few running counts:

#include <algorithm>
#include <iterator>
#include <vector>

template <int Size>
double mode(int const (&x)[Size]) {
    std::vector<int> tmp(x, x + Size);
    std::sort(tmp.begin(), tmp.end());
    int    size(0);  // size of the largest set so far
    int    count(0); // number of largest sets
    double sum(0);   // sum of largest sets
    for (auto it(tmp.begin()); it != tmp.end(); ) {
        // end of the current run of equal values
        auto end(std::upper_bound(it, tmp.end(), *it));
        if (size == std::distance(it, end)) {
            // another value ties with the largest set
            sum += *it;
            ++count;
        }
        else if (size < std::distance(it, end)) {
            // a strictly larger set: restart the running totals
            size = std::distance(it, end);
            sum = *it;
            count = 1;
        }
        it = end;
    }
    return sum / count;
}
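
When the data size is not known at compile time, the same sort-and-scan idea can be applied to a std::vector. The following is only a sketch: the name mode_of is hypothetical (it does not appear in the answer above), and returning 0.0 for empty input is an assumption made here to avoid dividing by zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical variant of the sort-and-scan approach for std::vector.
// The vector is taken by value so the caller's data is left unsorted.
double mode_of(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    std::ptrdiff_t best = 0;  // length of the longest run seen so far
    int ties = 0;             // number of values with a run of that length
    double sum = 0.0;         // sum of those values
    for (auto it = values.begin(); it != values.end(); ) {
        // One past the last element equal to *it.
        auto run_end = std::upper_bound(it, values.end(), *it);
        std::ptrdiff_t len = run_end - it;
        if (len == best) {
            sum += *it;
            ++ties;
        } else if (len > best) {
            best = len;
            sum = *it;
            ties = 1;
        }
        it = run_end;
    }
    // Assumption: empty input yields 0.0.
    return ties == 0 ? 0.0 : sum / ties;
}
```

With this sketch, mode_of({1, 1, 2, 2}) averages the two tied values, while mode_of({1, 1, 1, 1, 2, 2, 2}) returns the single most frequent value.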

If you simply wish to count the number of occurrences, I suggest you use a std::map or std::unordered_map.

If you map a counter to each distinct value, counting occurrences with std::map is easy, because each key can only be inserted once. To list the distinct numbers in your list, simply iterate over the map.

Here's an example of how you could do it:

#include <cstddef>
#include <iterator>
#include <map>
#include <numeric>
#include <iostream>

std::map<int, int> getOccurences(const int arr[], const std::size_t len) {
    std::map<int, int> m;
    for (std::size_t i = 0; i != len; ++i) {
        m[arr[i]]++;
    }
    return m;
}

int main() {
    int list[7]{1, 1, 1, 1, 2, 2, 2};
    auto occurences = getOccurences(list, 7);
    for (auto e : occurences) {
        std::cout << "Number " << e.first << " occurs ";
        std::cout << e.second << " times" << std::endl;
    }
    auto average = std::accumulate(std::begin(list), std::end(list), 0.0) / 7;
    std::cout << "Average is " << average << std::endl;
}

Output:

Number 1 occurs 4 times
Number 2 occurs 3 times
Average is 1.42857
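
The example above prints the average of the whole list. If what you want is the average of the most frequent values only, the occurrence map already holds everything needed. Here is a minimal sketch; averageOfModes is a hypothetical helper (not part of the code above), and returning 0.0 for an empty map is an assumption.

```cpp
#include <map>

// Hypothetical helper: average of the keys that share the highest count.
// Takes a map of value -> count, such as the one built by getOccurences.
double averageOfModes(const std::map<int, int>& occurrences) {
    int best = 0;       // highest count seen so far
    int ties = 0;       // number of values with that count
    double sum = 0.0;   // sum of those values
    for (const auto& entry : occurrences) {
        if (entry.second > best) {
            // Strictly more frequent value: restart the totals.
            best = entry.second;
            sum = entry.first;
            ties = 1;
        } else if (entry.second == best) {
            // Tie with the current most frequent value.
            sum += entry.first;
            ++ties;
        }
    }
    // Assumption: an empty map yields 0.0.
    return ties == 0 ? 0.0 : sum / ties;
}
```

For the list used above, averageOfModes(getOccurences(list, 7)) would give 1, since 1 is the only value occurring four times.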

Here's a working version of your code. m stores the distinct values in the array and q stores their counts. At the end it runs through all the distinct values to get the maximal count, the sum of the modes, and the number of distinct modes.

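#include <vector>

float mode(int x[], int n)
{
    // Empty input: return 0 rather than divide by zero
    if (n <= 0)
        return 0.0f;

    // Copy array and sort it. std::vector is used because
    // variable-length arrays are a compiler extension in C++.
    std::vector<int> y(x, x + n), m(n), q(n);
    int temp, j = 0, k = 0;

    // Exchange sort: after each pass, y[pass] holds the smallest remaining value
    for(int pass = 0; pass < n - 1; pass++)
        for(int pos = pass + 1; pos < n; pos++)
            if(y[pass] > y[pos]) {
                temp = y[pass];
                y[pass] = y[pos];
                y[pos] = temp;
            }

    // Collapse each run of equal values into a (value, count) pair
    for(int i = 0; i < n;){
        j = i;
        while (j < n && y[j] == y[i]) {
          j++;
        }
        m[k] = y[i];
        q[k] = j - i;
        k++;
        i = j;
    }

    int max = 0;
    int modes_count = 0;
    int modes_sum = 0;
    for (int i=0; i < k; i++) {
        if (q[i] > max) {
            max = q[i];
            modes_count = 1;
            modes_sum = m[i];
        } else if (q[i] == max) {
            modes_count += 1;
            modes_sum += m[i];
        }
    }

    // Convert before dividing so the average is not truncated
    return static_cast<float>(modes_sum) / modes_count;
}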