Groupby cumulative sum in pandas, then update using numpy based on a specific condition

Submitted by 拜拜、爱过 on 2020-08-10 23:04:21

Question


I have a data frame as shown below.

B_ID   No_Show   Session  slot_num   Patient_count
    1     0.4       S1        1          1
    2     0.3       S1        2          1
    3     0.8       S1        3          1
    4     0.3       S1        3          2
    5     0.6       S1        4          1
    6     0.8       S1        5          1
    7     0.9       S1        5          2
    8     0.4       S1        5          3
    9     0.6       S1        5          4
    12    0.9       S2        1          1
    13    0.5       S2        1          2
    14    0.3       S2        2          1
    15    0.7       S2        3          1
    20    0.7       S2        4          1
    16    0.6       S2        5          1
    17    0.8       S2        5          2
    19    0.3       S2        5          3

From the above, I would like to compute the cumulative No_Show within each Session:

df['Cum_No_show'] = df.groupby(['Session'])['No_Show'].cumsum()

Now we get:

B_ID   No_Show   Session  slot_num   Patient_count  Cum_No_show
    1     0.4       S1        1          1          0.4
    2     0.3       S1        2          1          0.7
    3     0.8       S1        3          1          1.5
    4     0.3       S1        3          2          1.8
    5     0.6       S1        4          1          2.4
    6     0.8       S1        5          1          3.2
    7     0.9       S1        5          2          4.1
    8     0.4       S1        5          3          4.5
    9     0.6       S1        5          4          5.1
    12    0.9       S2        1          1          0.9
    13    0.5       S2        1          2          1.4
    14    0.3       S2        2          1          1.7
    15    0.7       S2        3          1          2.4
    20    0.7       S2        4          1          3.1
    16    0.6       S2        5          1          3.7
    17    0.8       S2        5          2          4.5
    19    0.3       S2        5          3          4.8
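
The table above can be reproduced with a short, self-contained sketch. The data is copied from the question; the only addition is `round(1)`, which is an assumption made here to hide floating-point noise such as 0.7000000000000001.

```python
import pandas as pd

# Sample data copied from the question's first table.
df = pd.DataFrame({
    "B_ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 20, 16, 17, 19],
    "No_Show": [0.4, 0.3, 0.8, 0.3, 0.6, 0.8, 0.9, 0.4, 0.6,
                0.9, 0.5, 0.3, 0.7, 0.7, 0.6, 0.8, 0.3],
    "Session": ["S1"] * 9 + ["S2"] * 8,
    "slot_num": [1, 2, 3, 3, 4, 5, 5, 5, 5, 1, 1, 2, 3, 4, 5, 5, 5],
    "Patient_count": [1, 1, 1, 2, 1, 1, 2, 3, 4, 1, 2, 1, 1, 1, 1, 2, 3],
})

# Running total of No_Show that restarts at the first row of each Session.
# round(1) is only there to suppress floating-point noise in the display.
df["Cum_No_show"] = df.groupby("Session")["No_Show"].cumsum().round(1)

print(df.to_string(index=False))
```

The running total restarts at B_ID 12 because that is the first row of Session S2.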

From the above, I would like to create two new columns, named as below:

U_slot_num = updated slot number

U_No_show = updated cumulative no-show

Whenever the cumulative no-show is greater than 0.6, set the next row's slot_num to the same value as the current row's, increase the patient count by one, and update U_No_show by subtracting 1, as shown in the expected output.

Expected output:

No_Show  Session slot_num Patient_count Cum_No_show U_slot_num  U_No_show
 0.4       S1        1          1          0.4         1         0.4
 0.3       S1        2          1          0.7         2         0.7
 0.8       S1        3          1          1.5         2         0.5
 0.3       S1        3          2          1.8         3         0.8      
 0.6       S1        4          1          2.4         3         0.4
 0.8       S1        5          1          3.2         4         1.2
 0.9       S1        5          2          4.1         4         0.2
 0.4       S1        5          3          4.5         5         0.6
 0.6       S1        5          4          5.1         6         1.2
 0.9       S2        1          1          0.9         1         0.9
 0.5       S2        1          2          1.4         1         0.4
 0.3       S2        2          1          1.7         2         0.7
 0.7       S2        3          1          2.4         2         0.4
 0.7       S2        4          1          3.1         3         1.1
 0.6       S2        5          1          3.7         3         0.7
 0.8       S2        5          2          4.5         3         0.5
 0.3       S2        5          3          4.8         4         0.8
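
One possible reading of the rule is sketched below; it is an interpretation, not a confirmed solution. It assumes that the comparison is made against the previous row's U_No_show (not the raw Cum_No_show), that U_slot_num is a counter restarting at 1 in each session, and that values are kept to one decimal place. The helper `update_session` and the constant `THRESHOLD` are names invented for this sketch. A plain loop is used because each row depends on the previous row's updated value, which does not obviously reduce to a single vectorized expression. Patient_count is not recomputed, since the expected output does not show an updated version of it.

```python
import pandas as pd

THRESHOLD = 0.6  # cutoff taken from the question's rule


def update_session(no_show):
    """Walk one session in row order; return (U_slot_num, U_No_show) lists.

    Assumed rule: if the previous row's U_No_show is above THRESHOLD, the
    current row reuses the previous slot and 1 is subtracted from the
    running total; otherwise the slot counter advances by one.
    """
    slots, totals = [], []
    slot, total = 0, 0.0
    for i, value in enumerate(no_show):
        if i > 0 and totals[-1] > THRESHOLD:
            total = total + value - 1  # overbook: stay in the same slot
        else:
            total = total + value
            slot += 1                  # move on to a new slot
        total = round(total, 1)        # keep the > comparison stable
        slots.append(slot)
        totals.append(total)
    return slots, totals


# No_Show and Session columns copied from the question.
df = pd.DataFrame({
    "No_Show": [0.4, 0.3, 0.8, 0.3, 0.6, 0.8, 0.9, 0.4, 0.6,
                0.9, 0.5, 0.3, 0.7, 0.7, 0.6, 0.8, 0.3],
    "Session": ["S1"] * 9 + ["S2"] * 8,
})

df["U_slot_num"] = 0
df["U_No_show"] = 0.0
for _, group in df.groupby("Session", sort=False):
    slots, totals = update_session(group["No_Show"].tolist())
    df.loc[group.index, "U_slot_num"] = slots
    df.loc[group.index, "U_No_show"] = totals

print(df.to_string(index=False))
```

Under these assumptions the sketch reproduces every S2 row and the first six S1 rows of the expected output. For the last three S1 rows it gives U_slot_num 4, 4, 5 and U_No_show 1.1, 0.5, 1.1, which differs from the table, so the intended rule may be different there.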

Source: https://stackoverflow.com/questions/61346334/groupby-cumulative-in-pandas-then-update-using-numpy-based-specific-condition
