Question
My simplified dataframe is as follows:
import pandas as pd
df = pd.DataFrame()
df['A'] = ('IGNORE','IGNORE','IGNORE','YES','IGNORE','YES','YES','YES','IGNORE','IGNORE','IGNORE','YES','IGNORE','IGNORE','IGNORE','IGNORE','IGNORE','IGNORE','IGNORE','IGNORE','IGNORE', 'NO','IGNORE','IGNORE','IGNORE','IGNORE')
I need to reverse the dataframe (which I know I can do via df = df[::-1]) and then create column B as follows.
- If 'YES' occurs, the following rows become 'GOOD' until a 'YES' or 'NO' occurs again, and vice versa for 'NO', except that 'BAD' replaces 'GOOD'.
The desired output is as follows:
df['B'] = ('GOOD','GOOD','GOOD','YES','IGNORE','YES','YES','YES','GOOD','GOOD','GOOD','YES','BAD','BAD','BAD','BAD','BAD','BAD','BAD','BAD','BAD', 'NO','IGNORE','IGNORE','IGNORE','IGNORE')
Answer 1:
The idea is to first use Series.map with a dictionary, back-fill the missing values, and restore the last group with fillna. The resulting Series is then used to replace only runs of 2 or more consecutive IGNORE values:
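import numpy as np
# Map YES/NO to GOOD/BAD (IGNORE becomes NaN), back-fill, then restore the trailing IGNORE rows
s = df['A'].map({'IGNORE': np.nan, 'YES':'GOOD', 'NO':'BAD'}).bfill().fillna(df['A'])
# True for rows that belong to a run of 2 or more equal consecutive values
m1 = df.groupby(df['A'].ne(df['A'].shift()).cumsum())['A'].transform('size').ne(1)
# True for IGNORE rows
m2 = df['A'].eq('IGNORE')
df['C'] = np.where(m1 & m2, s, df['A'])
print(df)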
         A       B       C
0   IGNORE    GOOD    GOOD
1   IGNORE    GOOD    GOOD
2   IGNORE    GOOD    GOOD
3      YES     YES     YES
4   IGNORE  IGNORE  IGNORE
5      YES     YES     YES
6      YES     YES     YES
7      YES     YES     YES
8   IGNORE    GOOD    GOOD
9   IGNORE    GOOD    GOOD
10  IGNORE    GOOD    GOOD
11     YES     YES     YES
12  IGNORE     BAD     BAD
13  IGNORE     BAD     BAD
14  IGNORE     BAD     BAD
15  IGNORE     BAD     BAD
16  IGNORE     BAD     BAD
17  IGNORE     BAD     BAD
18  IGNORE     BAD     BAD
19  IGNORE     BAD     BAD
20  IGNORE     BAD     BAD
21      NO      NO      NO
22  IGNORE  IGNORE  IGNORE
23  IGNORE  IGNORE  IGNORE
24  IGNORE  IGNORE  IGNORE
25  IGNORE  IGNORE  IGNORE
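For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps with the intermediate results named. The helper name label_runs and the variables filled, run_id, run_size and mask are my own and not part of the original answer. It uses Series.where instead of np.where, which should give the same values here.

```python
import pandas as pd

# Same sample data as in the question, written compactly (26 rows).
a = (['IGNORE'] * 3 + ['YES', 'IGNORE'] + ['YES'] * 3 + ['IGNORE'] * 3
     + ['YES'] + ['IGNORE'] * 9 + ['NO'] + ['IGNORE'] * 4)
df = pd.DataFrame({'A': a})


def label_runs(col: pd.Series) -> pd.Series:
    """Hypothetical helper: relabel runs of 2+ 'IGNORE' from the next YES/NO below."""
    # Step 1: YES -> GOOD, NO -> BAD; IGNORE is not in the dict, so it becomes
    # missing. Back-filling gives each missing row the label of the next
    # YES/NO further down; rows after the last YES/NO get their original value.
    filled = col.map({'YES': 'GOOD', 'NO': 'BAD'}).bfill().fillna(col)
    # Step 2: number each run of consecutive equal values and measure its size.
    run_id = col.ne(col.shift()).cumsum()
    run_size = col.groupby(run_id).transform('size')
    # Step 3: only IGNORE rows inside runs longer than 1 are replaced.
    mask = col.eq('IGNORE') & run_size.gt(1)
    return col.where(~mask, filled)


df['C'] = label_runs(df['A'])
print(df)
```

The single IGNORE at index 4 stays IGNORE because its run has size 1, which is what the desired column B in the question shows.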
Source: https://stackoverflow.com/questions/56772211/based-on-dataframe-column-result-all-following-rows-equal-a-repetitive-value-unt