I don't do much coding outside of Matlab, but I need to export my Matlab code to another language, most likely C. My Matlab code includes a histogram function, histc.
The "ideal" histogram algorithm will depend upon the range you expect to capture. Generally any histogram algorithm will look like this:
#define NSAMPLES whatever // a compile-time constant, so the arrays below are legal C
double samples[NSAMPLES] = { 1.0, 3.93, 1e30, ... }; // your data set
#define NBUCKETS 10 // or whatever
int counts[NBUCKETS] = { 0 };
for (int i = 0; i != NSAMPLES; ++i) {
    counts[TRANSFER(samples[i])]++;
}
where TRANSFER() is some function that maps your inputs to a bin (with the 0th or Nth bin mapping to "out of range", if applicable).
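As a minimal sketch of one possible TRANSFER(), assuming equal-width bins over a range [lo, hi) that you know in advance: the names transfer_uniform, lo, and hi are made up for illustration, and sending out-of-range samples to two extra bins (index 0 below the range, index nbuckets + 1 above it) is just one reasonable convention, not the only one.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: map x to a bin index in 0 .. nbuckets + 1.
 * Bin 0 collects x < lo (and NaN, by assumption), bins 1 .. nbuckets
 * split [lo, hi) into equal widths, and bin nbuckets + 1 collects x >= hi. */
static size_t transfer_uniform(double x, double lo, double hi, size_t nbuckets)
{
    if (isnan(x) || x < lo)
        return 0;
    if (x >= hi)
        return nbuckets + 1;
    size_t k = (size_t)((x - lo) / (hi - lo) * (double)nbuckets);
    if (k >= nbuckets)          /* guard against floating-point rounding */
        k = nbuckets - 1;
    return k + 1;
}
```

With this convention the counts array needs nbuckets + 2 entries, and the loop body becomes something like counts[transfer_uniform(samples[i], lo, hi, NBUCKETS)]++.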
The exact implementation of TRANSFER() depends a lot on the expected distribution of your samples and where you are interested in detail. Some common approaches I have seen:
If you don't know the distribution up-front, then you really can't have an efficient mechanism to bin them effectively: you'll either have to guess (biased or uninformative results) or store everything and sort it at the end, binning into equal-sized buckets (poor performance).
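The store-and-sort fallback might look roughly like the sketch below. It reads "equal-sized buckets" as buckets holding about the same number of samples, which is an assumption on my part, and the helper names cmp_double and equal_count_edges are hypothetical.

```c
#include <stdlib.h>

/* qsort comparator for doubles (assumes the data contains no NaNs). */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts samples in place and writes nbuckets + 1
 * bin edges so that each bin holds roughly n / nbuckets samples.
 * Requires n >= 1 and nbuckets >= 1. */
static void equal_count_edges(double *samples, size_t n,
                              size_t nbuckets, double *edges)
{
    qsort(samples, n, sizeof samples[0], cmp_double);
    for (size_t k = 0; k < nbuckets; ++k)
        edges[k] = samples[k * n / nbuckets];  /* first sample of bin k */
    edges[nbuckets] = samples[n - 1];          /* maximum closes the last bin */
}
```

The cost is the one mentioned above: every sample has to be kept in memory, and the sort is O(n log n) rather than a single pass.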