Using conditional to generate new column in pandas dataframe


-上瘾入骨i 2020-11-29 02:26

I have a pandas dataframe that looks like this:

   portion  used
0        1   1.0
1        2   0.3
2        3   0.0
3        4   0.8

I'd

5 Answers

野趣味 (original poster)

2020-11-29 02:32

Alternatively, you could assign the labels through boolean masks with df.loc:

import pandas as pd
import numpy as np
df = pd.DataFrame(data={'portion':np.arange(10000), 'used':np.random.rand(10000)})

%%timeit
df.loc[df['used'] == 1.0, 'alert'] = 'Full'
df.loc[df['used'] == 0.0, 'alert'] = 'Empty'
df.loc[(df['used'] > 0.0) & (df['used'] < 1.0), 'alert'] = 'Partial'

This gives the same output as the apply approach but runs about 100 times faster on 10,000 rows:

100 loops, best of 3: 2.91 ms per loop
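To make the result concrete, here is a minimal sketch that applies the same three assignments to the four-row frame from the question. The label names are the ones used above; rows whose used value falls outside the range 0 to 1 would be left as NaN, since no mask selects them.

```python
import pandas as pd

# The four-row frame shown in the question.
df = pd.DataFrame({'portion': [1, 2, 3, 4], 'used': [1.0, 0.3, 0.0, 0.8]})

# Each boolean mask picks the rows to label; the first assignment
# creates the 'alert' column, the later ones fill in the remaining rows.
df.loc[df['used'] == 1.0, 'alert'] = 'Full'
df.loc[df['used'] == 0.0, 'alert'] = 'Empty'
df.loc[(df['used'] > 0.0) & (df['used'] < 1.0), 'alert'] = 'Partial'

print(df)
```

The rows come out labelled Full, Partial, Empty, Partial, in that order.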

Compare that with apply, where alert is a row-wise function that returns the label (not shown here):

%timeit df['alert'] = df.apply(alert, axis=1)

1 loops, best of 3: 287 ms per loop
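The alert function timed here is not defined in this answer. The sketch below is an assumption about what such a row-wise function might look like, written only to mirror the three df.loc conditions; it also checks that both approaches produce the same labels.

```python
import numpy as np
import pandas as pd

def alert(row):
    # Hypothetical row-wise function: the original is not shown in this
    # answer, so this simply mirrors the three df.loc conditions.
    if row['used'] == 1.0:
        return 'Full'
    elif row['used'] == 0.0:
        return 'Empty'
    elif 0.0 < row['used'] < 1.0:
        return 'Partial'

np.random.seed(0)  # fixed seed so the comparison is reproducible
df = pd.DataFrame({'portion': np.arange(10000),
                   'used': np.random.rand(10000)})

# Row-wise version: calls alert() once per row in Python.
by_apply = df.apply(alert, axis=1)

# Vectorized version: three mask-based assignments.
df.loc[df['used'] == 1.0, 'alert'] = 'Full'
df.loc[df['used'] == 0.0, 'alert'] = 'Empty'
df.loc[(df['used'] > 0.0) & (df['used'] < 1.0), 'alert'] = 'Partial'

same = by_apply.tolist() == df['alert'].tolist()
print(same)
```

The speed gap comes from apply with axis=1 invoking a Python function for every row, while each df.loc assignment operates on whole columns at once.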

I guess the choice depends on how big your dataframe is.
