Question
I am trying to speed up a standard deviation calculation in Python using NumPy and Numba. The bottleneck is the for loop, and I am looking for alternatives that would make the code much faster. I applied Numba to the existing NumPy version, but there was little gain in performance. My original list_ contains millions of values, so the calculation takes a very long time. The list_ array below is a short NumPy array meant only as an example, since I can't post the original data. The loop in the function below computes the standard deviation of each sliding window of length number (here 5) over list_. How can I make this function run faster?
import numpy as np
from numba import njit,jit,vectorize
number = 5
list_= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])
Normal code:
def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    print(std)
std_()
Numba Code:
jitted_func = njit()(std_)
jitted_func()
performance results:
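Answer 1:
You can do this in a vectorised fashion: build a strided view that exposes every window as a row, then call np.std once along the window axis.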
def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    return std
std1 = np.std(rolling_window(list_, 5), axis=1)
print(np.allclose(std1[:-1], std_()))
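This prints True. The [:-1] slice is needed because the loop in the question stops at len(list_) - number and therefore skips the last full window, which rolling_window includes. The code for rolling_window has been taken from this answer.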
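As a hedged alternative to the hand-written as_strided helper above, NumPy 1.20 and later ship np.lib.stride_tricks.sliding_window_view, which builds the same kind of windowed view but returns it read-only and checks the shape for you. The sketch below assumes that NumPy version is available; the names rolling_std, data and result are illustrative and not part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

def rolling_std(values, window):
    """Population standard deviation (ddof=0) of every full window."""
    # Read-only view of shape (len(values) - window + 1, window); no data is copied.
    windows = sliding_window_view(values, window)
    return windows.std(axis=1)

# First few values from the question's example array.
data = np.array([457.334015, 424.440002, 394.795990, 408.903992,
                 398.821014, 402.152008, 435.790985])
result = rolling_std(data, 5)
print(result.shape)  # one entry per full window: len(data) - 5 + 1
```

Unlike the loop in the question, this includes the last full window, so drop the final element if the results need to match the original function exactly.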
Comparison with Numba:
import numpy as np
from numba import njit,jit,vectorize
number = 5
list_= np.random.rand(10000)
def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    return std
%timeit np.std(rolling_window(list_, 5), axis=1)
%%timeit
jitted_func = njit()(std_)
jitted_func()
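This gives the following timings, for the vectorised version and the Numba version respectively. Note that the second cell calls njit inside the timed block, so its figure likely includes compilation time as well as execution time: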
499 µs ± 3.98 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
106 ms ± 2.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
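The timings above are for 10,000 values. For the millions of values mentioned in the question, one further option (a different technique from the strided view, and not part of the original answer) is to derive each window's variance from prefix sums, so the work per window no longer depends on the window length. The sketch below is a minimal version under that assumption; rolling_std_cumsum and the variable names are hypothetical. Be aware that the E[x^2] - E[x]^2 formula can lose precision when the values are large relative to their spread, so it is worth checking against np.std on a sample of your real data.

```python
import numpy as np

def rolling_std_cumsum(values, window):
    """Rolling population std (ddof=0) computed from prefix sums."""
    values = np.asarray(values, dtype=np.float64)
    # Prefix sums with a leading zero, so that c[j] - c[i] is the sum of values[i:j].
    c1 = np.concatenate(([0.0], np.cumsum(values)))
    c2 = np.concatenate(([0.0], np.cumsum(values * values)))
    s1 = c1[window:] - c1[:-window]   # sum of x over each window
    s2 = c2[window:] - c2[:-window]   # sum of x**2 over each window
    mean = s1 / window
    var = s2 / window - mean * mean   # E[x^2] - E[x]^2
    # Rounding can push a tiny variance slightly below zero; clamp before the sqrt.
    return np.sqrt(np.maximum(var, 0.0))

# Check against the straightforward per-window loop on random data.
rng = np.random.default_rng(0)
sample = rng.random(1_000)
fast = rolling_std_cumsum(sample, 5)
slow = np.array([sample[i:i + 5].std() for i in range(len(sample) - 5 + 1)])
print(np.allclose(fast, slow))
```

This version allocates only a few arrays of roughly the input's length, whereas np.std over the windowed view creates temporaries about window times larger.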
Source: https://stackoverflow.com/questions/65757073/optimizing-calculations-with-numpy-and-numba-python