I know this is achievable with boost as per:
Using boost::accumulators, how can I reset a rolling window size, does it keep extra history?
But I really would like to avoid using boost. I have googled and not found any suitable or readable examples.
Basically I want to track the moving average of an ongoing stream of a stream of floating point numbers using the most recent 1000 numbers as a data sample.
What is the easiest way to achieve this?
I experimented with using a circular array, exponential moving average and a more simple moving average and found that the results from the circular array suited my needs best.
You simply need a circular array of 1000 elements, where you add the element to the previous element and store it... It becomes an increasing sum, where you can always get the sum between any two pairs of elements, and divide by the number of elements between them, to yield the average.
If your needs are simple, you might just try using an exponential moving average.
http://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
Put simply, you make an accumulator variable, and as your code looks at each sample, the code updates the accumulator with the new value. You pick a constant "alpha" that is between 0 and 1, and compute this:
accumulator = (alpha * new_value) + (1.0 - alpha) * accumulator
You just need to find a value of "alpha" where the effect of a given sample only lasts for about 1000 samples.
Hmm, I'm not actually sure this is suitable for you, now that I've put it here. The problem is that 1000 is a pretty long window for an exponential moving average; I'm not sure there is an alpha that would spread the average over the last 1000 numbers, without underflow in the floating point calculation. But if you wanted a smaller average, like 30 numbers or so, this is a very easy and fast way to do it.
You can approximate a rolling average by applying a weighted average on your input stream.
template <unsigned N>
double approxRollingAverage (double avg, double input) {
avg -= avg/N;
avg += input/N;
return avg;
}
This way, you don't need to maintain 1000 buckets. However, it is an approximation, so it's value will not match exactly with a true rolling average.
Edit: Just noticed @steveha's post. This is equivalent to the exponential moving average, with the alpha being 1/N (I was taking N to be 1000 in this case to simulate 1000 buckets).
Basically I want to track the moving average of an ongoing stream of a stream of floating point numbers using the most recent 1000 numbers as a data sample.
Note that the below updates the total_
as elements as added/replaced, avoiding costly O(N) traversal to calculate the sum - needed for the average - on demand.
template <typename T, typename Total, size_t N>
class Moving_Average
{
public:
void operator()(T sample)
{
if (num_samples_ < N)
{
samples_[num_samples_++] = sample;
total_ += sample;
}
else
{
T& oldest = samples_[num_samples_++ % N];
total_ += sample - oldest;
oldest = sample;
}
}
operator double() const { return total_ / std::min(num_samples_, N); }
private:
T samples_[N];
size_t num_samples_{0};
Total total_{0};
};
Total
is made a different parameter from T
to support e.g. using a long long
when totalling 1000 long
s, an int
for char
s, or a double
to total float
s.
Issues
This is a bit flawed in that num_samples_
could conceptually wrap back to 0, but it's hard to imagine anyone having 2^64 samples: if concerned, use an extra bool data member to record when the container is first filled while cycling num_samples_
around the array (best then renamed something innocuous like "pos
").
Another issue is inherent with floating point precision, and can be illustrated with a simple scenario for T=double, N=2: we start with total_ = 0
, then inject samples...
1E17, we execute
total_ += 1E17
, sototal_ == 1E17
, then inject1, we execute
total += 1
, buttotal_ == 1E17
still, as the "1" is too insignificant to change the 64-bitdouble
representation of a number as large as 1E17, then we inject2, we execute
total += 2 - 1E17
, in which2 - 1E17
is evaluated first and yields-1E17
as the 2 is lost to imprecision/insignificance, so to our total of 1E17 we add -1E17 andtotal_
becomes 0, despite current samples of 1 and 2 for which we'd wanttotal_
to be 3. Our moving average will calculate 0 instead of 1.5. As we add another sample, we'll subtract the "oldest" 1 fromtotal_
despite it never having been properly incorporated therein; ourtotal_
and moving averages are likely to remain wrong.
You could add code that stores the highest recent total_
and if the current total_
is too small a fraction of that (a template parameter could provide a multiplicative threshold), you recalculate the total_
from all the samples in the samples_
array (and set highest_recent_total_
to the new total_
), but I'll leave that to the reader who cares sufficiently.
Simple class to calculate rolling average and also rolling standard deviation:
#define _stdev(cnt, sum, ssq) sqrt((((double)(cnt))*ssq-pow((double)(sum),2)) / ((double)(cnt)*((double)(cnt)-1)))
class moving_average {
private:
boost::circular_buffer<int> *q;
double sum;
double ssq;
public:
moving_average(int n) {
sum=0;
ssq=0;
q = new boost::circular_buffer<int>(n);
}
~moving_average() {
delete q;
}
void push(double v) {
if (q->size() == q->capacity()) {
double t=q->front();
sum-=t;
ssq-=t*t;
q->pop_front();
}
q->push_back(v);
sum+=v;
ssq+=v*v;
}
double size() {
return q->size();
}
double mean() {
return sum/size();
}
double stdev() {
return _stdev(size(), sum, ssq);
}
};
You could implement a ring buffer. Make an array of 1000 elements, and some fields to store the start and end indexes and total size. Then just store the last 1000 elements in the ring buffer, and recalculate the average as needed.
a simple moving average for 10 items, using a list:
#include <list>
std::list<float> listDeltaMA;
float getDeltaMovingAverage(float delta)
{
listDeltaMA.push_back(delta);
if (listDeltaMA.size() > 10) listDeltaMA.pop_front();
float sum = 0;
for (std::list<float>::iterator p = listDeltaMA.begin(); p != listDeltaMA.end(); ++p)
sum += (float)*p;
return sum / listDeltaMA.size();
}
One way can be to circularly store the values in the buffer array. and calculate average this way.
int j = (int) (counter % size);
buffer[j] = mostrecentvalue;
avg = (avg * size - buffer[j - 1 == -1 ? size - 1 : j - 1] + buffer[j]) / size;
counter++;
// buffer[j - 1 == -1 ? size - 1 : j - 1] is the oldest value stored
The whole thing runs in a loop where most recent value is dynamic.
I use this quite often in hard realtime systems that have fairly insane update rates (50kilosamples/sec) As a result I typically precompute the scalars.
To compute a moving average of N samples: scalar1 = 1/N; scalar2 = 1 - scalar1; // or (1 - 1/N) then:
Average = currentSample*scalar1 + Average*scalar2;
Example: Sliding average of 10 elements
double scalar1 = 1.0/10.0; // 0.1
double scalar2 = 1.0 - scalar1; // 0.9
bool first_sample = true;
double average=0.0;
while(someCondition)
{
double newSample = getSample();
if(first_sample)
{
// everybody forgets the initial condition *sigh*
average = newSample;
first_sample = false;
}
else
{
average = (sample*scalar1) + (average*scalar2);
}
}
Note: this is just a practical implementation of the answer given by steveha above. Sometimes it's easier to understand a concrete example.
来源:https://stackoverflow.com/questions/10990618/calculate-rolling-moving-average-in-c