Using conditional to generate new column in pandas dataframe

-上瘾入骨i 2020-11-29 02:26

I have a pandas dataframe that looks like this:

   portion  used
0        1   1.0
1        2   0.3
2        3   0.0
3        4   0.8

I'd

5 answers
  •  野趣味 (original poster)
    2020-11-29 02:32

    Alternatively you could do:

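    import pandas as pd
    import numpy as np

    # Synthetic test data: 10,000 rows with 'used' drawn uniformly from [0, 1)
    df = pd.DataFrame(data={'portion': np.arange(10000), 'used': np.random.rand(10000)})
    
    %%timeit
    # IPython cell magic; run this as its own cell, after the setup above
    df.loc[df['used'] == 1.0, 'alert'] = 'Full'
    df.loc[df['used'] == 0.0, 'alert'] = 'Empty'
    df.loc[(df['used'] > 0.0) & (df['used'] < 1.0), 'alert'] = 'Partial'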
    

    This gives the same output as the apply version, but runs about 100 times faster on 10,000 rows:

    100 loops, best of 3: 2.91 ms per loop
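
    To make the effect of the three assignments concrete, here is a small sketch that applies them to the four-row frame from the question instead of the random benchmark data. Rows that match none of the masks would be left as NaN.

```python
import pandas as pd

# The four-row example frame from the question
df = pd.DataFrame({'portion': [1, 2, 3, 4], 'used': [1.0, 0.3, 0.0, 0.8]})

# Each boolean mask selects rows; the assignment fills 'alert' for those rows only
df.loc[df['used'] == 1.0, 'alert'] = 'Full'
df.loc[df['used'] == 0.0, 'alert'] = 'Empty'
df.loc[(df['used'] > 0.0) & (df['used'] < 1.0), 'alert'] = 'Partial'

# 'alert' is now Full, Partial, Empty, Partial for rows 0..3
print(df)
```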
    

    For comparison, the same labelling done with apply:

    %timeit df['alert'] = df.apply(alert, axis=1)
    
    1 loops, best of 3: 287 ms per loop
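
    The alert function passed to apply is not defined in this answer. A minimal sketch of what it might look like, assuming it mirrors the three conditions used with df.loc (the function body is an assumption, not the original definition):

```python
import numpy as np
import pandas as pd

def alert(row):
    # Hypothetical row-wise labeller; assumed to mirror the df.loc conditions
    if row['used'] == 1.0:
        return 'Full'
    if row['used'] == 0.0:
        return 'Empty'
    if 0.0 < row['used'] < 1.0:
        return 'Partial'
    return np.nan  # values outside [0, 1] stay unlabelled

# The four-row example frame from the question
df = pd.DataFrame({'portion': [1, 2, 3, 4], 'used': [1.0, 0.3, 0.0, 0.8]})

# axis=1 calls alert once per row, which is why it is slower on large frames
df['alert'] = df.apply(alert, axis=1)
print(df)
```

    Because apply with axis=1 runs a Python function call for every row, its cost grows with the number of rows, whereas the boolean-mask assignments operate on whole columns at once.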
    

    I guess the choice depends on how big your dataframe is.
