Pandas vectorised function cumsum versus numpy

春和景丽 2021-01-03 07:30

While answering the question "Vectorize calculation of a Pandas Dataframe", I noticed an interesting performance issue.

I was under the impression that funct

2 answers
  •  予麋鹿 (original poster)
    2021-01-03 07:35

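    Pandas' cumsum handles NaN values (it skips them by default), and that handling costs time. You can get a feel for the overhead by timing NumPy's NaN-aware np.nancumsum against the plain np.cumsum: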

    import numpy as np

    a = np.random.randn(1000000)
    %timeit np.nancumsum(a)  # NaN-aware cumulative sum
    %timeit np.cumsum(a)     # plain cumulative sum
    

    Output (np.nancumsum first, then np.cumsum):

    9.02 ms ± 189 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    4.37 ms ± 18.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
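
The timings above only show the cost of NaN handling, not what it changes in the result. As a minimal sketch of the behavioural difference (the array `values` and the variable names are made up for illustration, and pandas is assumed to be using its default `skipna=True`), the three functions can be compared on a small input containing one NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a single missing value.
values = np.array([1.0, 2.0, np.nan, 4.0])

# Plain NumPy: once a NaN is hit, every later partial sum is NaN.
plain = np.cumsum(values)

# NaN-aware NumPy: NaN is treated as zero.
nan_aware = np.nancumsum(values)

# pandas (skipna=True by default): the NaN position stays NaN,
# but the running sum continues after it.
pandas_result = pd.Series(values).cumsum().to_numpy()

print(plain)          # [ 1.  3. nan nan]
print(nan_aware)      # [1. 3. 3. 7.]
print(pandas_result)  # [ 1.  3. nan  7.]
```

Neither NumPy function reproduces the pandas result exactly, so the comparison in the answer is best read as an indication of where the extra time goes rather than a like-for-like benchmark.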
    
