How do I create a sum row and sum column in pandas?

后端未结

关注

 4  454

I\'m going through the Khan Academy course on Statistics as a bit of a refresher from my college days, and as a way to get me up to speed on pandas & other scientific Py

相关标签:

4条回答

轻奢々

2020-12-08 07:26

The original data is:

>>> df = pd.DataFrame(dict(Undergraduate=[240, 3760], Graduate=[60, 440]), index=["Straight A's", "Not"])
>>> df
Out: 
              Graduate  Undergraduate
Straight A's        60            240
Not                440           3760

You can only use df.T to achieve recreating this table:

>>> df_new = df.T
>>> df_new
Out: 
               Straight A's   Not
Graduate                 60   440
Undergraduate           240  3760

After computing the Total by row and columns:

>>> df_new.loc['Total',:]= df_new.sum(axis=0)
>>> df_new.loc[:,'Total'] = df_new.sum(axis=1)
>>> df_new
Out: 
               Straight A's     Not   Total
Graduate               60.0   440.0   500.0
Undergraduate         240.0  3760.0  4000.0
Total                 300.0  4200.0  4500.0

0 讨论(0)

自闭症患者

2020-12-08 07:28

From the original data using crosstab, if just base on your input, you just need melt before crosstab

s=df.reset_index().melt('index')
pd.crosstab(index=s['index'],columns=s.variable,values=s.value,aggfunc='sum',margins=True)
Out[33]: 
variable      Graduate  Undergraduate   All
index                                      
Not                440           3760  4200
Straight A's        60            240   300
All                500           4000  4500

Toy data

df=pd.DataFrame({'c1':[1,2,2,3,4],'c2':[2,2,3,3,3],'c3':[1,2,3,4,5]}) 
# before `agg`, I think your input is the result after `groupby` 
df
Out[37]: 
   c1  c2  c3
0   1   2   1
1   2   2   2
2   2   3   3
3   3   3   4
4   4   3   5


pd.crosstab(df.c1,df.c2,df.c3,aggfunc='sum',margins
=True)
Out[38]: 
c2     2     3  All
c1                 
1    1.0   NaN    1
2    2.0   3.0    5
3    NaN   4.0    4
4    NaN   5.0    5
All  3.0  12.0   15

0 讨论(0)

轮回少年

2020-12-08 07:32
append and assign

The point of this answer is to provide an in line and not an in place solution.

append

I use append to stack a Series or DataFrame vertically. It also creates a copy so that I can continue to chain.

assign

I use assign to add a column. However, the DataFrame I'm working on is in the in between nether space. So I use a lambda in the assign argument which tells Pandas to apply it to the calling DataFrame.
```
df.append(df.sum().rename('Total')).assign(Total=lambda d: d.sum(1))

              Graduate  Undergraduate  Total
Not                440           3760   4200
Straight A's        60            240    300
Total              500           4000   4500
```
Fun alternative

Uses drop with errors='ignore' to get rid of potentially pre-existing Total rows and columns.

Also, still in line.
```
def tc(d):
  return d.assign(Total=d.drop('Total', errors='ignore', axis=1).sum(1))

df.pipe(tc).T.pipe(tc).T

              Graduate  Undergraduate  Total
Not                440           3760   4200
Straight A's        60            240    300
Total              500           4000   4500
```
0 讨论(0)
发布评论:

提交评论
- 加载中...

隐瞒了意图╮

2020-12-08 07:40

Or in two steps, using the .sum() function as you suggested (which might be a bit more readable as well):

import pandas as pd

df = pd.DataFrame( {"Undergraduate": {"Straight A's": 240, "Not": 3_760},"Graduate": {"Straight A's": 60, "Not": 440},})

#Total sum per column: 
df.loc['Total',:]= df.sum(axis=0)

#Total sum per row: 
df.loc[:,'Total'] = df.sum(axis=1)

Output:

              Graduate  Undergraduate  Total
Not                440           3760   4200
Straight A's        60            240    300
Total              500           4000   4500

0 讨论(0)

How do I create a sum row and sum column in pandas?

append and assign

append

assign

Fun alternative