Question
I have a dataframe like this:
Bool Hour
0 False 12
1 False 24
2 False 12
3 False 24
4 True 12
5 False 24
6 False 12
7 False 24
8 False 12
9 False 24
10 False 12
11 True 24
and I would like to backfill each True value in the 'Bool' column up to the nearest preceding row where 'Hour' is 12. The result would look something like this:
Bool Hour Result
0 False 12 False
1 False 24 False
2 False 12 True <- desired backfill
3 False 24 True <- desired backfill
4 True 12 True
5 False 24 False
6 False 12 False
7 False 24 False
8 False 12 False
9 False 24 False
10 False 12 True <- desired backfill
11 True 24 True
Any help is greatly appreciated! Thank you very much!
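For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame shown above. The name `expected` is a hypothetical label of my own for the desired 'Result' column, copied row by row from the second table.

```python
import pandas as pd

# Sample data copied from the tables in the question.
df = pd.DataFrame({
    "Bool": [False, False, False, False, True, False,
             False, False, False, False, False, True],
    "Hour": [12, 24] * 6,
})

# Desired 'Result' column (hypothetical helper name, values from the question).
expected = [False, False, True, True, True, False,
            False, False, False, False, True, True]

print(df.assign(Result=expected))
```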
Answer 1:
This is a little tricky, but it can be done with groupby and idxmax:
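# Bottom-up, find for each block the label of the nearest non-True row with Hour == 12
s = (~df.Bool & df.Hour.eq(12)).iloc[::-1].groupby(df.Bool.iloc[::-1].cumsum()).transform('idxmax')
# A row is filled when its index is at or below that label
df['result'] = df.index >= s.iloc[::-1]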
df
Out[375]:
Bool Hour result
0 False 12 False
1 False 24 False
2 False 12 True
3 False 24 True
4 True 12 True
5 False 24 False
6 False 12 False
7 False 24 False
8 False 12 False
9 False 24 False
10 False 12 True
11 True 24 True
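To see why this works, the one-liner can be unpacked into named steps. This is only a sketch on the sample data; the names `is_start`, `block` and `start` are mine, not part of the answer, and I compare plain NumPy arrays in the last step to keep the comparison purely positional.

```python
import pandas as pd

df = pd.DataFrame({
    "Bool": [False, False, False, False, True, False,
             False, False, False, False, False, True],
    "Hour": [12, 24] * 6,
})

# Candidate start rows: Hour is 12 and the row is not itself True.
is_start = ~df["Bool"] & df["Hour"].eq(12)

# Walking bottom-up, every True opens a new block, so each block holds
# one True row plus the False rows above it.
block = df["Bool"].iloc[::-1].cumsum()

# idxmax gives the label of the first True met in each reversed block,
# i.e. the candidate row closest above the block's True row.
start = is_start.iloc[::-1].groupby(block).transform("idxmax").iloc[::-1]

# Rows at or below their block's start label get filled.
df["result"] = df.index.to_numpy() >= start.to_numpy()
print(df)
```

One thing to watch: rows below the last True also form a block (key 0), so data that does not end on a True row would likely need those rows masked out.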
Answer 2:
IIUC, you can do:
s = df['Bool'].shift(-1)
df['Result'] = df['Bool'] | s.where(s).groupby(df['Hour'].eq(12).cumsum()).bfill()
Output:
Bool Hour Result
0 False 12 False
1 False 24 False
2 False 12 True
3 False 24 True
4 True 12 True
5 False 24 False
6 False 12 False
7 False 24 False
8 False 12 False
9 False 24 False
10 False 12 True
11 True 24 True
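The same idea, spelled out step by step as a sketch. Two things here are my own assumptions rather than the answer's code: I pass `fill_value=False` to `shift` and use a float marker (1.0 or NaN) in place of the object-dtype `s.where(s)`, so the dtypes stay simple; the names `nxt`, `block` and `marker` are also mine.

```python
import pandas as pd

df = pd.DataFrame({
    "Bool": [False, False, False, False, True, False,
             False, False, False, False, False, True],
    "Hour": [12, 24] * 6,
})

# True on the row directly above each True row.
nxt = df["Bool"].shift(-1, fill_value=False)

# A new block starts at every row where Hour is 12.
block = df["Hour"].eq(12).cumsum()

# Keep a marker only where nxt is True, then back-fill it inside each block,
# which carries it up to the block's first row (the Hour == 12 row).
marker = nxt.astype(float).where(nxt)
filled = marker.groupby(block).bfill().notna()

df["Result"] = df["Bool"] | filled
print(df)
```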
Answer 3:
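Create a group ID s over runs of consecutive False values, with each True row in a group of its own. Group Hour.eq(12) by s. Use transform('sum') and cumsum to count, bottom-up within each group, how many rows with Hour equal to 12 remain below the current row, and return True where that count is 0. Finally, AND this with the backfilled mask s1 and OR it with the values of Bool.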
s = df.Bool.ne(df.Bool.shift()).cumsum()
s1 = df.where(df.Bool).Bool.bfill()
g = df.Hour.eq(12).groupby(s)
df['bfill_Bool'] = (g.transform('sum') - g.cumsum()).eq(0) & s1 | df.Bool
Out[905]:
Bool Hour bfill_Bool
0 False 12 False
1 False 24 False
2 False 12 True
3 False 24 True
4 True 12 True
5 False 24 False
6 False 12 False
7 False 24 False
8 False 12 False
9 False 24 False
10 False 12 True
11 True 24 True
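Here is a sketch of the intermediate values on the sample data, in case the counting step is hard to follow. The names `run`, `remaining` and `has_true_below` are hypothetical, and I use a float marker with `where` plus `bfill` instead of `df.where(df.Bool).Bool.bfill()` to avoid object dtype; the logic is otherwise meant to mirror the code above.

```python
import pandas as pd

df = pd.DataFrame({
    "Bool": [False, False, False, False, True, False,
             False, False, False, False, False, True],
    "Hour": [12, 24] * 6,
})

# Run id: increases whenever Bool changes, so each True row is its own run.
run = df["Bool"].ne(df["Bool"].shift()).cumsum()

# Within each run, count the Hour == 12 rows strictly below the current row.
g = df["Hour"].eq(12).astype(int).groupby(run)
remaining = g.transform("sum") - g.cumsum()

# True for rows that have some True row at or below them.
has_true_below = df["Bool"].astype(float).where(df["Bool"]).bfill().notna()

# Fill rows with no Hour == 12 left below them in the run, and keep the
# original True rows.
df["bfill_Bool"] = (remaining.eq(0) & has_true_below) | df["Bool"]
print(df.assign(run=run, remaining=remaining))
```

A count of 0 means the row is at or below the last Hour == 12 row of its run, which is exactly the stretch leading up to the next True.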
Source: https://stackoverflow.com/questions/58104114/python-pandas-dataframe-backfill-based-on-two-conditions