Calculate rolling / moving average in C++

攒了一身酷 2020-12-02 08:09

I know this is achievable with boost as per:

Using boost::accumulators, how can I reset a rolling window size, does it keep extra history?

But I really would

10 Answers
  •  误落风尘
    2020-12-02 08:55

    Basically I want to track the moving average of an ongoing stream of floating point numbers using the most recent 1000 numbers as a data sample.

    Note that the code below updates total_ as elements are added/replaced, avoiding a costly O(N) traversal to calculate the sum (needed for the average) on demand.

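    #include <algorithm>  // std::min
    #include <cstddef>    // std::size_t

    // T: sample type; Total: type used for the running sum; N: window size.
    template <typename T, typename Total, std::size_t N>
    class Moving_Average
    {
      public:
        // Add one sample, replacing the oldest once the window is full.
        void operator()(T sample)
        {
            if (num_samples_ < N)
            {
                // Still filling the window.
                samples_[num_samples_++] = sample;
                total_ += sample;
            }
            else
            {
                // Window is full: overwrite the oldest slot and adjust the sum.
                T& oldest = samples_[num_samples_++ % N];
                total_ += sample - oldest;
                oldest = sample;
            }
        }

        // Current average; only meaningful after at least one sample.
        // The cast avoids integer division when Total is an integral type.
        operator double() const
        {
            return static_cast<double>(total_) / std::min(num_samples_, N);
        }

      private:
        T samples_[N];
        std::size_t num_samples_{0};
        Total total_{0};
    };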
    

    Total is made a different parameter from T to support e.g. using a long long when totalling 1000 longs, an int for chars, or a double to total floats.

    Issues

    This is a bit flawed in that num_samples_ could conceptually wrap back to 0, but it's hard to imagine anyone having 2^64 samples: if concerned, use an extra bool data member to record when the container is first filled while cycling num_samples_ around the array (best then renamed something innocuous like "pos").
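A minimal sketch of that variant follows, assuming the same template parameters as Moving_Average. The names Moving_Average_Wrapped, pos_, filled_ and size() are illustrative and not part of the original answer.

```cpp
#include <cstddef>

// Hypothetical variant: pos_ cycles through [0, N) and filled_ records when
// the buffer has wrapped for the first time, so no counter grows unbounded.
template <typename T, typename Total, std::size_t N>
class Moving_Average_Wrapped
{
  public:
    void operator()(T sample)
    {
        if (filled_)
            total_ += sample - samples_[pos_];  // replace the oldest sample
        else
            total_ += sample;                   // still filling the buffer
        samples_[pos_] = sample;
        if (++pos_ == N)
        {
            pos_ = 0;
            filled_ = true;
        }
    }

    // Number of samples currently contributing to the average.
    std::size_t size() const { return filled_ ? N : pos_; }

    // Only meaningful after at least one sample has been added.
    operator double() const { return static_cast<double>(total_) / size(); }

  private:
    T samples_[N];
    std::size_t pos_{0};
    bool filled_{false};
    Total total_{0};
};
```

With a window of 3 and samples 1, 2, 3, 4, the window ends up holding 2, 3, 4, so converting the object to double gives 3.0.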

    Another issue is inherent with floating point precision, and can be illustrated with a simple scenario for T=double, N=2: we start with total_ = 0, then inject samples...

    • 1E17, we execute total_ += 1E17, so total_ == 1E17, then inject

    • 1, we execute total_ += 1, but total_ == 1E17 still, as the "1" is too insignificant to change the 64-bit double representation of a number as large as 1E17, then we inject

    • 2, we execute total_ += 2 - 1E17, in which 2 - 1E17 is evaluated first and yields -1E17 as the 2 is lost to imprecision/insignificance, so to our total of 1E17 we add -1E17 and total_ becomes 0, despite current samples of 1 and 2 for which we'd want total_ to be 3. Our moving average will calculate 0 instead of 1.5. As we add another sample, we'll subtract the "oldest" 1 from total_ despite it never having been properly incorporated therein; our total_ and moving averages are likely to remain wrong.
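The three updates above can be replayed in isolation. This is only a sketch that assumes IEEE-754 64-bit doubles without extended-precision intermediates; the function name total_after_three_samples is made up for illustration.

```cpp
// Replays the three updates for T = double, N = 2 and returns the running
// total afterwards. Assumes IEEE-754 64-bit double arithmetic.
double total_after_three_samples()
{
    double samples[2];
    double total = 0;

    samples[0] = 1E17;
    total += 1E17;            // total == 1E17

    samples[1] = 1;
    total += 1;               // unchanged: 1 is below the precision of 1E17

    double& oldest = samples[0];
    total += 2 - oldest;      // 2 - 1E17 rounds to -1E17
    oldest = 2;

    return total;             // the live samples are 1 and 2, so 3 was wanted
}
```

On such a platform the function returns 0 rather than 3, which is the error described in the last bullet.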

    You could add code that stores the highest recent total_ and if the current total_ is too small a fraction of that (a template parameter could provide a multiplicative threshold), you recalculate the total_ from all the samples in the samples_ array (and set highest_recent_total_ to the new total_), but I'll leave that to the reader who cares sufficiently.
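One possible reading of that suggestion is sketched below. It is not the original author's code: the class name Moving_Average_Resync, the members threshold_ and highest_recent_total_, and the default threshold of 1e-6 are all assumptions, and the threshold is passed to the constructor rather than as a template parameter to keep the sketch valid in C++11. The magnitude of total_ is compared so that negative totals are handled too.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical variant that re-sums the buffer when the running total has
// shrunk to a small fraction of its highest recent magnitude.
template <typename T, typename Total, std::size_t N>
class Moving_Average_Resync
{
  public:
    explicit Moving_Average_Resync(double threshold = 1e-6)
      : threshold_(threshold) { }

    void operator()(T sample)
    {
        if (num_samples_ < N)
        {
            samples_[num_samples_++] = sample;
            total_ += sample;
        }
        else
        {
            T& oldest = samples_[num_samples_++ % N];
            total_ += sample - oldest;
            oldest = sample;
        }

        const double magnitude = std::fabs(static_cast<double>(total_));
        if (magnitude > highest_recent_total_)
            highest_recent_total_ = magnitude;
        else if (magnitude < threshold_ * highest_recent_total_)
            recalculate();  // total_ may have lost the small samples
    }

    operator double() const
    {
        return static_cast<double>(total_) / std::min(num_samples_, N);
    }

  private:
    // O(N) resum of the stored samples; resets the high-water mark.
    void recalculate()
    {
        const std::size_t count = std::min(num_samples_, N);
        total_ = 0;
        for (std::size_t i = 0; i < count; ++i)
            total_ += samples_[i];
        highest_recent_total_ = std::fabs(static_cast<double>(total_));
    }

    T samples_[N];
    std::size_t num_samples_{0};
    Total total_{0};
    double threshold_;
    double highest_recent_total_{0};
};
```

For the T=double, N=2 scenario above (samples 1E17, 1, 2), the third update drives total_ to 0, which is below the threshold, so the buffer is re-summed and the average comes out as 1.5. This is a heuristic: precision loss that does not shrink total_ below the chosen fraction goes undetected.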
