Adding total row to a pandas DataFrame with tuples inside

Question

My previous question has been answered. It solved my initial problem, but now I am stuck on another one.

I have the pandas.DataFrame below, and I am trying to add total rows for each sub-level.

Level  Company  Item
1      X        a       (10, 20)
                b       (10, 20)
       Y        a       (10, 20)
                b       (10, 20)
                c       (10, 20)
2      X        a       (10, 20)
                b       (10, 20)
                c       (10, 20)
       Y        a       (10, 20)

I would like to get this:

Level  Company  Item
1      X        a       (10, 20)
                b       (10, 20)
                total   (20, 40)
       Y        a       (10, 20)
                b       (10, 20)
                c       (10, 20)
                total   (30, 60)
       total            (50, 100)
                total   (50, 100)
2      X        a       (10, 20)
                b       (10, 20)
                c       (10, 20)
                total   (30, 60)
       Y        a       (10, 20)
                total   (10, 20)
       total            (40, 80)
                total   (40, 80)

To build the DataFrame:

import pandas as pd

level = list(map(int, '111112222'))
company = list('XXYYYXXXY')
item = list('ababcabca')
value = [(10, 20)] * 9
col = ['Level', 'Company', 'Item', 'Value']
# Build one row per (Level, Company, Item) combination
df = pd.DataFrame([level, company, item, value]).T
df.columns = col
df.groupby(['Level', 'Company', 'Item'])['Value'].sum()

But my result is:

Level  Company  Item
1      X        a       (10, 20)
                b       (10, 20)
       Y        a       (10, 20)
                b       (10, 20)
                c       (10, 20)
       total            (50, 100)
2      X        a       (10, 20)
                b       (10, 20)
                c       (10, 20)
       Y        a       (10, 20)
       total            (40, 80)

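This is what I get using the script below:

# df is assumed here to be the grouped Series from the previous snippet
def f(values):
    # Element-wise sum of the tuples in `values`, skipping non-tuples such as NaN
    return tuple(sum(parts) for parts in zip(*filter(lambda v: type(v) == tuple, values)))

m = df.unstack(level=['Company', 'Item'])
m = m.assign(total=m.apply(f, axis=1))
m = m.stack(level='Company')
m = m.assign(total=m.apply(f))
m = m.stack(level='Item')
m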
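As a side note on the helper used in the script above: adding two tuples with `+` concatenates them, which is why the element-wise sum goes through `zip`. A minimal sketch of that idea, using a hypothetical helper name `tuple_sum` that does not appear in the original post:

```python
def tuple_sum(values):
    """Element-wise sum of equal-length tuples, skipping non-tuple entries (e.g. NaN)."""
    tuples = [v for v in values if isinstance(v, tuple)]
    # zip(*tuples) regroups the first elements together, the second elements together, ...
    return tuple(sum(parts) for parts in zip(*tuples))


# Plain tuple addition concatenates instead of summing
concatenated = (10, 20) + (10, 20)

# The helper sums position by position and ignores the NaN placeholder
summed = tuple_sum([(10, 20), float("nan"), (10, 20), (10, 20)])

print(concatenated)  # (10, 20, 10, 20)
print(summed)        # (30, 60)
```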

Answer 1:

Use:

s = df.groupby(['Level', 'Company', 'Item'])['Value'].sum()

def GetTupleSum(x):
    # Element-wise sum of the tuples in x, ignoring missing values
    return tuple(sum(y) for y in zip(*x.dropna()))

df = s.unstack('Item')
df['total'] = df.apply(GetTupleSum, axis=1)
(df.unstack()
   .assign(total_company=df['total'].groupby(level=0).apply(GetTupleSum))
   .stack(['Company', 'Item']))

Output

Level  Company  Item         
1      X        a                 (10, 20)
                b                 (10, 20)
                total             (20, 40)
       Y        a                 (10, 20)
                b                 (10, 20)
                c                 (10, 20)
                total             (30, 60)
                total_company    (50, 100)
2      X        a                 (10, 20)
                b                 (10, 20)
                c                 (10, 20)
                total             (30, 60)
       Y        a                 (10, 20)
                total             (10, 20)
                total_company     (40, 80)
dtype: object
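For comparison, the same totals can be computed without any reshaping. The following is only a sketch in plain Python, not the method from the answer above: the names `tuple_sum` and `with_totals` are hypothetical, and it assumes the rows are already sorted by Level and then Company, as in the question's data.

```python
from itertools import groupby

level = [1, 1, 1, 1, 1, 2, 2, 2, 2]
company = list("XXYYYXXXY")
item = list("ababcabca")
value = [(10, 20)] * 9
rows = list(zip(level, company, item, value))


def tuple_sum(values):
    """Element-wise sum of equal-length tuples."""
    return tuple(sum(parts) for parts in zip(*values))


def with_totals(rows):
    """Append a 'total' row after each company and after each level.

    Assumes rows are (level, company, item, value) tuples sorted by
    level and company, because itertools.groupby only groups adjacent rows.
    """
    out = []
    for lvl, lvl_rows in groupby(rows, key=lambda r: r[0]):
        lvl_rows = list(lvl_rows)
        for comp, comp_rows in groupby(lvl_rows, key=lambda r: r[1]):
            comp_rows = list(comp_rows)
            out.extend(comp_rows)
            out.append((lvl, comp, "total", tuple_sum(r[3] for r in comp_rows)))
        # Level-wide total, labelled 'total' in both the Company and Item slots
        out.append((lvl, "total", "total", tuple_sum(r[3] for r in lvl_rows)))
    return out


result = with_totals(rows)
for row in result:
    print(row)
```

The resulting list of rows could then be passed to `pd.DataFrame(result, columns=['Level', 'Company', 'Item', 'Value'])` and indexed with `set_index(['Level', 'Company', 'Item'])` if a MultiIndex layout like the one above is wanted.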

Source: https://stackoverflow.com/questions/59364298/adding-total-row-to-a-pandas-dataframe-with-tuples-inside

Tags

python-3.x

pandas

group-by

pivot-table