Replace values within a groupby based on multiple conditions

Submitted by 狂风中的少年 on 2021-01-28 11:34:43

Question


My question is related to this one, but I still don't see how to apply the answer to my problem. I have a DataFrame like so:

import pandas as pd

df = pd.DataFrame({
    'date': ['2001-01-01', '2001-02-01', '2001-03-01', '2001-04-01', '2001-02-01', '2001-03-01', '2001-04-01'],
    'cohort': ['2001-01-01', '2001-01-01', '2001-01-01', '2001-01-01', '2001-02-01', '2001-02-01', '2001-02-01'],
    'val': [100, 101, 102, 101, 200, 201, 201]
})

df
    date        cohort      val
0   2001-01-01  2001-01-01  100
1   2001-02-01  2001-01-01  101
2   2001-03-01  2001-01-01  102
3   2001-04-01  2001-01-01  101
4   2001-02-01  2001-02-01  200
5   2001-03-01  2001-02-01  201
6   2001-04-01  2001-02-01  201

Within each cohort, I want to replace val with the cohort's maximum val, but only for observations whose date is earlier than the date associated with that maximum. So rows 0, 1, and 4 would be changed to look like this:

df  # This is what I want my final df to look like
    date        cohort      val
0   2001-01-01  2001-01-01  102
1   2001-02-01  2001-01-01  102
2   2001-03-01  2001-01-01  102
3   2001-04-01  2001-01-01  101
4   2001-02-01  2001-02-01  201
5   2001-03-01  2001-02-01  201
6   2001-04-01  2001-02-01  201

How can I do this without lots of loops?


Answer 1:


  1. Determine the maximum value of val PER GROUP of cohort
  2. Determine the date on which that maximum occurs, again per cohort
  3. Perform vectorised comparison and replacement with np.where

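import numpy as np

g = df.groupby('cohort').val
v = g.transform('max')                         # per-row maximum of its cohort
d = df.cohort.map(g.idxmax()).map(df.date)     # per-row date of that cohort's (first) maximum
df['val'] = np.where(df.date < d, v, df.val)   # replace only rows dated before the peak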

df
    date        cohort      val
0   2001-01-01  2001-01-01  102
1   2001-02-01  2001-01-01  102
2   2001-03-01  2001-01-01  102
3   2001-04-01  2001-01-01  101
4   2001-02-01  2001-02-01  201
5   2001-03-01  2001-02-01  201
6   2001-04-01  2001-02-01  201
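
In the sample data both cohorts happen to peak on the same date, so it does not show whether the peak date is really looked up per cohort. The following is a minimal sketch of a check, not part of the original answer. It assumes the dates are ISO-formatted strings (so string comparison matches chronological order). The function name replace_before_peak and the demo data are hypothetical and used for illustration only.

```python
import numpy as np
import pandas as pd


def replace_before_peak(df):
    """Return a copy where, per cohort, rows dated before the cohort's
    peak take the peak value (hypothetical helper for illustration)."""
    out = df.copy()
    grouped = out.groupby('cohort')['val']
    peak_val = grouped.transform('max')   # per-row maximum of its cohort
    peak_row = grouped.idxmax()           # cohort -> index label of first maximum
    # Look up, for every row, the date on which its own cohort peaks.
    peak_date = out['cohort'].map(peak_row).map(out['date'])
    out['val'] = np.where(out['date'] < peak_date, peak_val, out['val'])
    return out


# Hypothetical data: cohort 'a' peaks in March, cohort 'b' peaks in January.
demo = pd.DataFrame({
    'date':   ['2001-01-01', '2001-02-01', '2001-03-01',
               '2001-01-01', '2001-02-01', '2001-03-01'],
    'cohort': ['a', 'a', 'a', 'b', 'b', 'b'],
    'val':    [1, 2, 3, 9, 8, 7],
})

result = replace_before_peak(demo)
print(result['val'].tolist())  # [3, 3, 3, 9, 8, 7]
```

Cohort 'a' is filled up to its March peak, while cohort 'b' is left untouched because nothing precedes its January peak. A comparison against a single global peak date would leave row 1 of cohort 'a' unchanged, which is why the date is mapped back per cohort here.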


Source: https://stackoverflow.com/questions/50418372/replace-values-within-a-groupby-based-on-multiple-conditions
