Why is np.where faster than pd.apply?

时光说笑 2020-12-01 21:50

Here is the sample code:

import pandas as pd
import numpy as np

df = pd.DataFrame({'Customer' : ['Bob', 'Ken', 'Steve', 'Joe'],
                   'Sp


        
2 Answers
  •  遥遥无期
    2020-12-01 22:39

    Just adding a visual perspective to what has already been said.

    Profile and total cumulative time of df.apply:

    We can see that the cumulative time is 13.8 s.

    Profile and total cumulative time of np.where:

    Here, the cumulative time is 5.44 ms, which is roughly 2,500 times faster than df.apply.
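A gap of this size is consistent with how the two calls work: df.apply with axis=1 invokes a Python function once per row, while np.where evaluates the condition over the whole column in compiled code. The following is only a minimal sketch of such a comparison. Because the sample code in the question is cut off, the `amount` column, the threshold of 500, and the `label_row` helper are all hypothetical, and the timings it prints will differ from the numbers above.

```python
import time

import numpy as np
import pandas as pd

# Hypothetical data: the column name, size, and threshold are assumptions,
# not the DataFrame from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame({"amount": rng.integers(0, 1000, size=20_000)})


def label_row(row):
    # df.apply(axis=1) calls this once per row in the Python interpreter.
    return "High" if row["amount"] > 500 else "Low"


start = time.perf_counter()
by_apply = df.apply(label_row, axis=1)
apply_seconds = time.perf_counter() - start

start = time.perf_counter()
# np.where evaluates the boolean mask for the whole column in one call.
by_where = np.where(df["amount"] > 500, "High", "Low")
where_seconds = time.perf_counter() - start

print(f"df.apply: {apply_seconds:.4f} s")
print(f"np.where: {where_seconds:.4f} s")
```

Both approaches produce the same labels; only the amount of per-row Python overhead differs.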

    The figures above were obtained using the snakeviz library. Here is a link to the library.

    SnakeViz displays profiles as a sunburst in which functions are represented as arcs. A root function is a circle at the middle, with functions it calls around, then the functions those functions call, and so on. The amount of time spent inside a function is represented by the angular width of the arc. An arc that wraps most of the way around the circle represents a function that is taking up most of the time of its calling function, while a skinny arc represents a function that is using hardly any time at all.
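Profiles of the kind described above can be recorded with the standard library's cProfile module and then opened in SnakeViz. This is a rough sketch under assumptions, not the exact setup used for the figures: the `amount` column, the two wrapper functions, the `profile_to_file` helper, and the output file names are all made up for illustration.

```python
import cProfile
import os
import pstats
import tempfile

import numpy as np
import pandas as pd

# Hypothetical data standing in for the DataFrame from the question.
df = pd.DataFrame({"amount": np.arange(20_000) % 1000})


def with_apply():
    # Row-wise version: one Python call per row.
    return df.apply(lambda row: "High" if row["amount"] > 500 else "Low", axis=1)


def with_where():
    # Vectorized version: one call over the whole column.
    return np.where(df["amount"] > 500, "High", "Low")


def profile_to_file(func, path):
    # Run func under cProfile, save the raw stats, and load them back.
    profiler = cProfile.Profile()
    profiler.runcall(func)
    profiler.dump_stats(path)
    return pstats.Stats(path)


out_dir = tempfile.mkdtemp()
apply_path = os.path.join(out_dir, "apply.prof")
where_path = os.path.join(out_dir, "where.prof")

apply_stats = profile_to_file(with_apply, apply_path)
where_stats = profile_to_file(with_where, where_path)

print("df.apply function calls:", apply_stats.total_calls)
print("np.where function calls:", where_stats.total_calls)
print("profiles written to:", apply_path, "and", where_path)
```

Each saved file can then be opened from a shell with `snakeviz` followed by the file path, which shows the profile in the browser.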
