Calculate rolling / moving average in C++

前端未结

关注

 10  2110

攒了一身酷 2020-12-02 08:09

I know this is achievable with boost as per:

Using boost::accumulators, how can I reset a rolling window size, does it keep extra history?

But I really would

10条回答

误落风尘 (楼主)

2020-12-02 08:55
Basically I want to track the moving average of an ongoing stream of a stream of floating point numbers using the most recent 1000 numbers as a data sample.

Note that the below updates the total_ as elements as added/replaced, avoiding costly O(N) traversal to calculate the sum - needed for the average - on demand.
```
template 
class Moving_Average
{
  public:
    void operator()(T sample)
    {
        if (num_samples_ < N)
        {
            samples_[num_samples_++] = sample;
            total_ += sample;
        }
        else
        {
            T& oldest = samples_[num_samples_++ % N];
            total_ += sample - oldest;
            oldest = sample;
        }
    }

    operator double() const { return total_ / std::min(num_samples_, N); }

  private:
    T samples_[N];
    size_t num_samples_{0};
    Total total_{0};
};
```
Total is made a different parameter from T to support e.g. using a long long when totalling 1000 longs, an int for chars, or a double to total floats.

Issues

This is a bit flawed in that num_samples_ could conceptually wrap back to 0, but it's hard to imagine anyone having 2^64 samples: if concerned, use an extra bool data member to record when the container is first filled while cycling num_samples_ around the array (best then renamed something innocuous like "pos").

Another issue is inherent with floating point precision, and can be illustrated with a simple scenario for T=double, N=2: we start with total_ = 0, then inject samples...
- 1E17, we execute total_ += 1E17, so total_ == 1E17, then inject
- 1, we execute total += 1, but total_ == 1E17 still, as the "1" is too insignificant to change the 64-bit double representation of a number as large as 1E17, then we inject
- 2, we execute total += 2 - 1E17, in which 2 - 1E17 is evaluated first and yields -1E17 as the 2 is lost to imprecision/insignificance, so to our total of 1E17 we add -1E17 and total_ becomes 0, despite current samples of 1 and 2 for which we'd want total_ to be 3. Our moving average will calculate 0 instead of 1.5. As we add another sample, we'll subtract the "oldest" 1 from total_ despite it never having been properly incorporated therein; our total_ and moving averages are likely to remain wrong.
You could add code that stores the highest recent total_ and if the current total_ is too small a fraction of that (a template parameter could provide a multiplicative threshold), you recalculate the total_ from all the samples in the samples_ array (and set highest_recent_total_ to the new total_), but I'll leave that to the reader who cares sufficiently.
0 讨论(0)

查看其它10个回答
发布评论:

提交评论
- 加载中...