I have a pandas dataframe that looks like this:
   portion  used
0        1   1.0
1        2   0.3
2        3   0.0
3        4   0.8
I'd
Alternatively you could do:
import pandas as pd
import numpy as np
df = pd.DataFrame(data={'portion':np.arange(10000), 'used':np.random.rand(10000)})
%%timeit
df.loc[df['used'] == 1.0, 'alert'] = 'Full'
df.loc[df['used'] == 0.0, 'alert'] = 'Empty'
df.loc[(df['used'] > 0.0) & (df['used'] < 1.0), 'alert'] = 'Partial'
This gives the same output as the apply version but runs about 100 times faster on 10,000 rows:
100 loops, best of 3: 2.91 ms per loop
For comparison, using apply:
%timeit df['alert'] = df.apply(alert, axis=1)
1 loops, best of 3: 287 ms per loop
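So the choice depends on how big your dataframe is: on a small frame the difference is negligible, while on a large one the boolean-mask version is much faster.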
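If you want to check that the two approaches agree, here is a minimal sketch on the four-row frame from the question. The `alert` function is not shown above, so the version here is only my assumption of what it looks like (a plain row-wise if/elif); adjust it to match your own.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({'portion': [1, 2, 3, 4], 'used': [1.0, 0.3, 0.0, 0.8]})

def alert(row):
    # Hypothetical row-wise function, assumed to mirror the one timed with apply.
    if row['used'] == 1.0:
        return 'Full'
    elif row['used'] == 0.0:
        return 'Empty'
    elif 0.0 < row['used'] < 1.0:
        return 'Partial'

# Row-wise version: calls alert() once per row.
by_apply = df.apply(alert, axis=1)

# Vectorized version: one boolean mask per category.
df.loc[df['used'] == 1.0, 'alert'] = 'Full'
df.loc[df['used'] == 0.0, 'alert'] = 'Empty'
df.loc[(df['used'] > 0.0) & (df['used'] < 1.0), 'alert'] = 'Partial'

print(df)
print((df['alert'] == by_apply).all())  # True if both approaches agree
```

On this frame the alert column comes out as Full, Partial, Empty, Partial for both versions. Note that any value outside the range 0.0 to 1.0 is left as NaN by the mask version and as None by this assumed `alert`, so handle that case explicitly if your data can contain it.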