pandas add new row based on sum/difference of other rows

泄露秘密 posted on 2021-02-08 09:24:07

Question


The df I have:

id    measure   t1   t2   t3
1     savings    1    2    5
1     income    10   15   14
1     misc       5    5    5
2     savings    3    6   12
2     income     4   20   80
2     misc       1    1    1

The df I want: for each id, add a new row with measure = spend, calculated as income minus savings for each of the periods t1, t2, t3.

id    measure   t1   t2   t3
1     savings    1    2    5
1     income    10   15   14
1     misc       5    5    5
1     spend      9   13    9
2     savings    3    6   12
2     income     4   20   80
2     misc       1    1    1
2     spend      1   14   68

What I tried:

df.loc[df['measure'] == 'spend'] = (
    df.loc[df['measure'] == 'income']
    - df.loc[df['measure'] == 'savings']
)

This fails because I am not incorporating groupby to get the desired outcome.


Answer 1:


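Here is one way, using groupby with diff:

import pandas as pd

# Keep only the two rows that enter the difference (savings first, then income).
df1 = df[df.measure.isin(['savings', 'income'])].copy()

# Within each id, diff() yields income - savings; the first row of each group is NaN and is dropped.
s = df1.groupby('id', sort=False)[['t1', 't2', 't3']].diff().dropna().assign(id=df.id.unique(), measure='spend')
# DataFrame.append was removed in pandas 2.0, so use pd.concat instead.
df = pd.concat([df, s], sort=True).sort_values('id', kind='stable')
df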
Out[276]: 
   id  measure    t1    t2    t3
0   1  savings   1.0   2.0   5.0
1   1   income  10.0  15.0  14.0
1   1    spend   9.0  13.0   9.0
2   2  savings   3.0   6.0  12.0
3   2   income   4.0  20.0  80.0
3   2    spend   1.0  14.0  68.0
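
The diff approach above relies on savings appearing before income within each id. As a minimal sketch of an alternative that does not depend on row order, the measures can be looked up by name through a MultiIndex. The helper name add_spend_rows and the periods list are illustrative assumptions, not part of the original answer; the sample frame is the one from the question.

```python
import pandas as pd

# Sample frame from the question, including the misc rows.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "measure": ["savings", "income", "misc"] * 2,
    "t1": [1, 10, 5, 3, 4, 1],
    "t2": [2, 15, 5, 6, 20, 1],
    "t3": [5, 14, 5, 12, 80, 1],
})
periods = ["t1", "t2", "t3"]  # assumed list of period columns


def add_spend_rows(frame):
    """Return a copy of frame with one 'spend' row (income - savings) per id."""
    # Index by (id, measure) so each measure can be selected by name.
    wide = frame.set_index(["id", "measure"])[periods]
    income = wide.xs("income", level="measure")
    savings = wide.xs("savings", level="measure")
    # Both operands are indexed by id, so the subtraction aligns per id.
    spend = (income - savings).assign(measure="spend").reset_index()
    out = pd.concat([frame, spend], ignore_index=True)
    # A stable sort keeps each appended spend row after its id's other rows.
    return out.sort_values("id", kind="stable").reset_index(drop=True)


result = add_spend_rows(df)
print(result)
```

Because both inputs are integer columns and no NaN is produced along the way, the period columns keep their integer dtype here.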

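Update: if spend should instead be income minus every other measure (savings and misc here), start again from the original df, flip the sign of the non-income rows, and sum per id.

df1 = df.copy()
# Negate every row that is not income, so the group sum is income - (everything else).
df1.loc[df1.measure.ne('income'), 't1':] *= -1
s = df1.groupby('id', sort=False)[['t1', 't2', 't3']].sum().reset_index().assign(measure='spend')
df = pd.concat([df, s], sort=True).sort_values('id', kind='stable')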
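
The sign-flip-and-sum idea can be generalized with an explicit sign per measure, which makes it clear which rows enter the result. The following is only a sketch under assumptions: the helper signed_spend and the signs dictionaries are hypothetical names introduced for illustration, and the frame is the sample from the question.

```python
import pandas as pd

# Sample frame from the question, including the misc rows.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "measure": ["savings", "income", "misc"] * 2,
    "t1": [1, 10, 5, 3, 4, 1],
    "t2": [2, 15, 5, 6, 20, 1],
    "t3": [5, 14, 5, 12, 80, 1],
})
periods = ["t1", "t2", "t3"]  # assumed list of period columns


def signed_spend(frame, signs):
    """Sum the period columns per id after weighting each row by signs[measure]."""
    # +1 adds a measure, -1 subtracts it, 0 leaves it out of the total.
    weights = frame["measure"].map(signs)
    signed = frame[periods].mul(weights, axis=0)
    spend = signed.groupby(frame["id"], sort=False).sum().reset_index()
    return spend.assign(measure="spend")


# income - savings, ignoring misc (matches the desired output in the question).
income_minus_savings = signed_spend(df, {"income": 1, "savings": -1, "misc": 0})

# income - savings - misc (what negating all non-income rows computes).
income_minus_rest = signed_spend(df, {"income": 1, "savings": -1, "misc": -1})

result = (
    pd.concat([df, income_minus_savings], ignore_index=True)
    .sort_values("id", kind="stable")
    .reset_index(drop=True)
)
print(result)
```

With misc weighted 0 the spend rows are 9, 13, 9 and 1, 14, 68; with misc weighted -1 they become 4, 8, 4 and 0, 13, 67, so the choice of signs should follow whichever definition of spend is intended.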


Source: https://stackoverflow.com/questions/57378381/pandas-add-new-row-based-on-sum-difference-of-other-rows
