How to calculate moving average in Python 3?

Let's say I have a list:

y = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

I want to create a function that calculates

5 Answers

天涯浪人

2020-12-06 02:31
Use the sum and map functions.
```
print(sum(map(int, x[num-n:num])))
```
The map function in Python 3 is basically a lazy version of this:
```
[int(i) for i in x[num-n:num]]
```
I'm sure you can guess what the sum function does.
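For completeness, here is a minimal sketch of how that expression could be turned into a full moving average. The answer does not define x, num or n, so this assumes x is the list of numeric strings, n is the window size and num is the exclusive end index of the window. The helper name moving_average_at is made up for illustration.

```python
def moving_average_at(x, num, n):
    """Average of the n items of x that end just before index num."""
    # map(int, ...) converts the string items lazily; sum() consumes them
    return sum(map(int, x[num - n:num])) / n


y = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

# Slide the end index from the first complete window to the end of the list
averages = [moving_average_at(y, num, 5) for num in range(5, len(y) + 1)]
print(averages)  # [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

Note that this re-sums every slice, so it does n additions per output value.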

隐瞒了意图╮

2020-12-06 02:32

There is a great sliding window generator in an old version of the Python docs with itertools examples:

from itertools import islice

def window(seq, n=2):
    """Returns a sliding window (of width n) over data from the iterable

    s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...
    """
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result    
    for elem in it:
        result = result[1:] + (elem,)
        yield result
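As a side note, the same window can probably be sketched with collections.deque(maxlen=n), which drops the oldest element automatically instead of rebuilding a tuple with result[1:] + (elem,). This is a variant, not the recipe from the docs, and the name window_deque is hypothetical:

```python
from collections import deque
from itertools import islice


def window_deque(seq, n=2):
    """Sliding window of width n, backed by a bounded deque."""
    it = iter(seq)
    win = deque(islice(it, n), maxlen=n)  # first n items (or fewer)
    if len(win) == n:
        yield tuple(win)
    for elem in it:
        win.append(elem)  # maxlen=n silently discards the leftmost item
        yield tuple(win)


print(list(window_deque(range(5), 3)))  # [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
```

Like the original, it yields nothing when the input has fewer than n items.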

Using the window() generator above, computing your moving averages is trivial:

from __future__ import division  # For Python 2

def moving_averages(values, size):
    for selection in window(values, size):
        yield sum(selection) / size

Running this against your input (mapping the strings to integers) gives:

>>> y= ['1', '2', '3', '4','5','6','7','8','9','10']
>>> for avg in moving_averages(map(int, y), 5):
...     print(avg)
... 
3.0
4.0
5.0
6.0
7.0
8.0

To return None for the first n - 1 iterations, where the window is still 'incomplete', just expand the moving_averages function a little:

def moving_averages(values, size):
    for _ in range(size - 1):
        yield None
    for selection in window(values, size):
        yield sum(selection) / size
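To see what the padded version produces, here is a self-contained sketch that repeats the two functions from this answer and runs them on the list from the question. Nothing here is new apart from the variable name padded.

```python
from itertools import islice


def window(seq, n=2):
    """Sliding window of width n over seq (same logic as the answer above)."""
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result


def moving_averages(values, size):
    # size - 1 placeholders for the positions without a complete window
    for _ in range(size - 1):
        yield None
    for selection in window(values, size):
        yield sum(selection) / size


y = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
padded = list(moving_averages(map(int, y), 5))
print(padded)  # [None, None, None, None, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

With the padding, the output lines up one-to-one with the input as long as the input has at least size - 1 items. For shorter inputs the function still yields size - 1 None values, which may or may not be what you want.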


萌比男神i

2020-12-06 02:40

While I like Martijn's answer on this, like george, I was wondering if this wouldn't be faster by using a running summation instead of applying the sum() over and over again on mostly the same numbers.
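Stripped of all options, the running-summation idea might look roughly like the sketch below: each step adds the incoming value and subtracts the outgoing one, so the work per output value does not depend on the window size. The name running_sum_averages is only illustrative, and this simplified version covers complete windows only.

```python
from collections import deque


def running_sum_averages(values, size):
    """Moving averages over complete windows, using a running total."""
    win = deque()
    total = 0.0
    for v in values:
        win.append(v)
        total += v                    # add the value entering the window
        if len(win) > size:
            total -= win.popleft()    # subtract the value leaving it
        if len(win) == size:
            yield total / size


print(list(running_sum_averages(range(1, 11), 5)))
# [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

One caveat with float inputs: repeatedly adding and subtracting can accumulate rounding error over very long streams, which re-summing each window avoids.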

Also the idea of having None values as default during the ramp up phase is interesting. In fact there may be plenty of different scenarios one could conceive for moving averages. Let's split the calculation of averages into three phases:

Ramp Up: Starting iterations where the current iteration count < window size
Steady Progress: We have exactly window size number of elements available to calculate a normal average := sum(x[iteration_counter-window_size:iteration_counter])/window_size
Ramp Down: At the end of the input data, we could return another window_size - 1 "average" numbers.
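To make the three phases concrete, here is a small sketch that only labels each step and shows which items are in the window at that point. It assumes the input has at least window-size items, and the helper name phases is made up for this illustration; it is not part of the function below.

```python
def phases(data, size):
    """Yield (phase, items_in_window) for every step of the sliding window."""
    data = list(data)
    n = len(data)
    # `end` is the exclusive end of the window, allowed to run past the data
    for end in range(1, n + size):
        items = data[max(0, end - size):min(end, n)]
        if end < size:
            phase = "ramp up"
        elif end <= n:
            phase = "steady"
        else:
            phase = "ramp down"
        yield phase, items


steps = list(phases([1, 2, 3, 4], 3))
for phase, items in steps:
    print(phase, items)
# ramp up [1]
# ramp up [1, 2]
# steady [1, 2, 3]
# steady [2, 3, 4]
# ramp down [3, 4]
# ramp down [4]
```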

Here's a function that accepts

Arbitrary iterables (generators are fine) as input for data
Arbitrary window sizes >= 1
Parameters to switch on/off production of values during the phases for Ramp Up/Down
Callback functions for those phases to control how values are produced. This can be used to constantly provide a default (e.g. None) or to provide partial averages

Here's the code:

from collections import deque 

def moving_averages(data, size, rampUp=True, rampDown=True):
    """Slide a window of <size> elements over <data> to calc an average

    First and last <size-1> iterations when window is not yet completely
    filled with data, or the window empties due to exhausted <data>, the
    average is computed with just the available data (but still divided
    by <size>).
    Set rampUp/rampDown to False in order to not provide any values during
    those start and end <size-1> iterations.
    Set rampUp/rampDown to functions to provide arbitrary partial average
    numbers during those phases. The callback will get the currently
    available input data in a deque. Do not modify that data.
    """
    d = deque()
    running_sum = 0.0

    data = iter(data)
    # rampUp
    for count in range(1, size):
        try:
            val = next(data)
        except StopIteration:
            break
        running_sum += val
        d.append(val)
        #print("up: running sum:" + str(running_sum) + "  count: " + str(count) + "  deque: " + str(d))
        if rampUp:
            if callable(rampUp):
                yield rampUp(d)
            else:
                yield running_sum / size

    # steady
    exhausted_early = True
    for val in data:
        exhausted_early = False
        running_sum += val
        #print("st: running sum:" + str(running_sum) + "  deque: " + str(d))
        yield running_sum / size
        d.append(val)
        running_sum -= d.popleft()

    # rampDown
    if rampDown:
        if exhausted_early:
            running_sum -= d.popleft()
        for (count) in range(min(len(d), size-1), 0, -1):
            #print("dn: running sum:" + str(running_sum) + "  deque: " + str(d))
            if callable(rampDown):
                yield rampDown(d)
            else:
                yield running_sum / size
            running_sum -= d.popleft()

It seems to be a bit faster than Martijn's version - which is far more elegant, though. Here's the test code:

print("")
print("Timeit")
print("-" * 80)

from itertools import islice
def window(seq, n=2):
    """Returns a sliding window (of width n) over data from the iterable

    s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...
    """
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result    
    for elem in it:
        result = result[1:] + (elem,)
        yield result

# Martijn's version:
def moving_averages_SO(values, size):
    for selection in window(values, size):
        yield sum(selection) / size


import timeit
problems = [int(i) for i in (10, 100, 1000, 10000, 1e5, 1e6, 1e7)]
for problem_size in problems:
    print("{:12s}".format(str(problem_size)), end="")

    so = timeit.repeat("list(moving_averages_SO(range("+str(problem_size)+"), 5))", number=1*max(problems)//problem_size,
                       setup="from __main__ import moving_averages_SO")
    print("{:12.3f} ".format(min(so)), end="")

    my = timeit.repeat("list(moving_averages(range("+str(problem_size)+"), 5, False, False))", number=1*max(problems)//problem_size,
                       setup="from __main__ import moving_averages")
    print("{:12.3f} ".format(min(my)), end="")

    print("")

And the output:

Timeit
--------------------------------------------------------------------------------
10                 7.242        7.656 
100                5.816        5.500 
1000               5.787        5.244 
10000              5.782        5.180 
100000             5.746        5.137 
1000000            5.745        5.198 
10000000           5.764        5.186

The original question can now be solved with this function call:

print(list(moving_averages(range(1,11), 5,
                           rampUp=lambda _: None,
                           rampDown=False)))

The output:

[None, None, None, None, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]


独厮守ぢ

2020-12-06 02:44

There is another solution extending an itertools recipe pairwise(). You can extend this to nwise(), which gives you the sliding window (and works if the iterable is a generator):

import itertools as it

def nwise(iterable, n):
    ts = it.tee(iterable, n)
    for c, t in enumerate(ts):
        next(it.islice(t, c, c), None)
    return zip(*ts)

def moving_averages_nw(iterable, n):
    yield from (sum(x)/n for x in nwise(iterable, n))

>>> list(moving_averages_nw(range(1, 11), 5))
[3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
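The line next(it.islice(t, c, c), None) is the "consume" idiom from the itertools recipes: it advances iterator t by c items without storing them. A minimal sketch of what that does to the tee'd copies, with the helper name consume borrowed from that recipe and a fixed n of 3 chosen just for the demonstration:

```python
import itertools as it


def consume(iterator, n):
    """Advance iterator by n items (does nothing when n is 0)."""
    next(it.islice(iterator, n, n), None)


# Three independent copies of the same stream
a, b, c = it.tee(range(1, 6), 3)
consume(b, 1)  # b now starts at 2
consume(c, 2)  # c now starts at 3

triples = list(zip(a, b, c))
print(triples)  # [(1, 2, 3), (2, 3, 4), (3, 4, 5)]
```

zip() stops as soon as the most advanced copy runs out, which is why only complete windows come out.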

While the setup cost is relatively high for short iterables, its impact shrinks as the data set grows. This uses sum(), but the code is reasonably elegant:

Timeit              MP           cfi         *****
--------------------------------------------------------------------------------
10                 4.658        4.959        7.351 
100                5.144        4.070        4.234 
1000               5.312        4.020        3.977 
10000              5.317        4.031        3.966 
100000             5.508        4.115        4.087 
1000000            5.526        4.263        4.202 
10000000           5.632        4.326        4.242


梦毁少年i

2020-12-06 02:47

An approach that avoids recomputing intermediate sums:

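values = range(0, 12)
runningsum = 0

def runs(v):
    # add v to the global running total and return the new total
    global runningsum
    runningsum += v
    return runningsum

# runsumlist[k] is the sum of the first k values (leading 0 = empty prefix)
runsumlist = [0] + [runs(v) for v in values]
result = [(runsumlist[k] - runsumlist[k-5]) / 5 for k in range(5, len(runsumlist))]

print(result)

[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]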

Make that runs(int(v)), and then wrap each result as repr((runsumlist[k] - runsumlist[k-5]) / 5), if you want to carry the numbers around as strings.

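Alternative without the global:

values = [float(x) for x in range(0, 12)]
nave = 5
# first average over the initial window, then update it incrementally
movingave = [sum(values[:nave]) / nave]
for i in range(len(values) - nave):
    movingave.append(movingave[-1] + (values[i + nave] - values[i]) / nave)
print(movingave)

Be sure to do floating-point math even if your input values are integers (in Python 3, / is already true division).

[2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]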
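The prefix-sum idea can also be sketched with itertools.accumulate, which builds the running totals without a global or a manual loop. The function name moving_averages_prefix is made up here, and the leading 0 is added so that every window sum is a plain difference of two prefix sums:

```python
from itertools import accumulate


def moving_averages_prefix(values, size):
    """Moving averages over complete windows via prefix sums."""
    # prefix[k] holds the sum of the first k values; prefix[0] is 0
    prefix = [0] + list(accumulate(values))
    return [(prefix[k] - prefix[k - size]) / size
            for k in range(size, len(prefix))]


print(moving_averages_prefix(range(0, 12), 5))
# [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

# For the list of strings from the question, convert first:
y = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
print(moving_averages_prefix(map(int, y), 5))
# [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```

Unlike the generator-based answers, this materialises the whole prefix list, so it suits data that already fits in memory.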
