Adding total row to pandas DataFrame groupby

♀尐吖头ヾ 提交于 2019-12-24 18:13:25

问题


I am aware of this link but I didn't manage to solve my problem.

I have this below DataFrame from pandas.DataFrame.groupby().sum():

                                                          Value
Level      Company         Item
    1            X            a                             100
                              b                             200
                 Y            a                              35
                              b                             150
                              c                              35
    2            X            a                              48
                              b                             100
                              c                              50
                 Y            a                              80

and would like to add total rows for each level of index that I have to get:

                                                          Value
Level      Company         Item
    1            X            a                             100
                              b                             200
                          Total                             300
                 Y            a                              35
                              b                             150
                              c                              35
                          Total                             520
             Total                                          820
    2            X            a                              48
                              b                             100
                              c                              50
             Total                                          198
                 Y            a                              80
                          Total                              80
               Total                                        278
Total                                                      1098

As request

level = list(map(int, list('111112222')))
company = list('XXYYYXXXY')
item = list('ababcabca')
value = [100,200,35,150,35,48,100,50,80]
col = ['Level', 'Company', 'Item', 'Value']
df = pd.DataFrame([level,company,item,value]).T
df.columns = col
df.groupby(['Level', 'Company', 'Item']).sum()

回答1:


You can try stacking it one level at a time:

m = df.groupby(['Level','Company','Item'])['Value'].sum().unstack(level=['Company','Item'])
m = m.assign(total=m.sum(1))
m = m.stack(level='Company')
m = m.assign(total=m.sum(1))
m = m.stack(level='Item')

The output has duplicate totals though:

Level  Company  Item 
1      X        a        100.0
                b        200.0
                total    300.0
       Y        a         35.0
                b        150.0
                c         35.0
                total    220.0
       total             520.0
                total    520.0
2      X        a         48.0
                b        100.0
                c         50.0
                total    198.0
       Y        a         80.0
                total     80.0
       total             278.0
                total    278.0
dtype: float64



回答2:


You can use:

m=df.groupby(['Level','Company','Item'])['Value'].sum().unstack()
m.assign(total=m.sum(1)).stack().to_frame('Value')

                     Value
Level Company Item        
1     X       a      100.0
              b      200.0
              total  300.0
      Y       a       35.0
              b      150.0
              c       35.0
              total  220.0
2     X       a       48.0
              b      100.0
              c       50.0
              total  198.0
      Y       a       80.0
              total   80.0



回答3:


Try this: Basically, this is creating two new dfs from the using the sum of the two groups and concating the three data frames

level = list(map(int, list('111112222')))
company = list('XXYYYXXXY')
item = list('ababcabca')
value = [100,200,35,150,35,48,100,50,80]
col = ['Level', 'Company', 'Item', 'Value']
df = pd.DataFrame([level,company,item,value]).T
df.columns = col

df1 = (df.groupby(['Level', 'Company', 'Item'])['Value'].sum())
df2 = (df1.sum(level=0).to_frame().assign(Company='total').set_index('Company', append=True))
df3 = (df1.groupby(['Level','Company']).sum().to_frame().assign(Item='total').set_index('Item', append=True))

dfx = pd.concat([df1.to_frame().reset_index(),
                 df2.reset_index(),
                 df3.reset_index()],sort=False)
print(dfx)

Output:

   Level Company   Item  Value
0      1       X      a    100
1      1       X      b    200
2      1       Y      a     35
3      1       Y      b    150
4      1       Y      c     35
5      2       X      a     48
6      2       X      b    100
7      2       X      c     50
8      2       Y      a     80
0      1   total    NaN    520
1      2   total    NaN    278
0      1       X  total    300
1      1       Y  total    220
2      2       X  total    198
3      2       Y  total     80

This is not sorted though as you expect. If I concat the 3 dfs without resetting the index I'm getting the expected sort order, but the index is a multi-index column

dfx = pd.concat([df1.to_frame(), df2, df3]).sort_index()

Output

               Value
(1, X, a)        100
(1, X, b)        200
(1, X, total)    300
(1, Y, a)         35
(1, Y, b)        150
(1, Y, c)         35
(1, Y, total)    220
(1, total)       520
(2, X, a)         48
(2, X, b)        100
(2, X, c)         50
(2, X, total)    198
(2, Y, a)         80
(2, Y, total)     80
(2, total)       278

I am not sure how to convert this to columns as in your df.



来源:https://stackoverflow.com/questions/59326903/adding-total-row-to-pandas-dataframe-groupby

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!