How to optimize MAPE code in Python?

问题

I need to have a MAPE function, however I was not able to find it in standard packages ... Below, my implementation of this function.

def mape(actual, predict): 
    tmp, n = 0.0, 0
    for i in range(0, len(actual)):
        if actual[i] <> 0:
            tmp += math.fabs(actual[i]-predict[i])/actual[i]
            n += 1
    return (tmp/n)

I don't like it, it's super not optimal in terms of speed. How to rewrite the code to be more Pythonic way and boost the speed?

回答1:

Here's one vectorized approach with masking -

def mape_vectorized(a, b): 
    mask = a <> 0
    return (np.fabs(a[mask] - b[mask])/a[mask]).mean()

Probably a faster one with masking after division computation -

def mape_vectorized_v2(a, b): 
    mask = a <> 0
    return (np.fabs(a - b)/a)[mask].mean()

Runtime test -

In [217]: a = np.random.randint(-10,10,(10000))
     ...: b = np.random.randint(-10,10,(10000))
     ...: 

In [218]: %timeit mape(a,b)
100 loops, best of 3: 11.7 ms per loop

In [219]: %timeit mape_vectorized(a,b)
1000 loops, best of 3: 273 µs per loop

In [220]: %timeit mape_vectorized_v2(a,b)
1000 loops, best of 3: 220 µs per loop

来源：https://stackoverflow.com/questions/42250958/how-to-optimize-mape-code-in-python

标签

python

numpy

machine-learning

statistics

data-science

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!