How do I create a sum row and sum column in pandas?

后端 未结 4 431
刺人心
刺人心 2020-12-08 06:55

I\'m going through the Khan Academy course on Statistics as a bit of a refresher from my college days, and as a way to get me up to speed on pandas & other scientific Py

相关标签:
4条回答
  • 2020-12-08 07:26

    The original data is:

    >>> df = pd.DataFrame(dict(Undergraduate=[240, 3760], Graduate=[60, 440]), index=["Straight A's", "Not"])
    >>> df
    Out: 
                  Graduate  Undergraduate
    Straight A's        60            240
    Not                440           3760
    

    You can only use df.T to achieve recreating this table:

    >>> df_new = df.T
    >>> df_new
    Out: 
                   Straight A's   Not
    Graduate                 60   440
    Undergraduate           240  3760
    

    After computing the Total by row and columns:

    >>> df_new.loc['Total',:]= df_new.sum(axis=0)
    >>> df_new.loc[:,'Total'] = df_new.sum(axis=1)
    >>> df_new
    Out: 
                   Straight A's     Not   Total
    Graduate               60.0   440.0   500.0
    Undergraduate         240.0  3760.0  4000.0
    Total                 300.0  4200.0  4500.0
    
    0 讨论(0)
  • 2020-12-08 07:28

    From the original data using crosstab, if just base on your input, you just need melt before crosstab

    s=df.reset_index().melt('index')
    pd.crosstab(index=s['index'],columns=s.variable,values=s.value,aggfunc='sum',margins=True)
    Out[33]: 
    variable      Graduate  Undergraduate   All
    index                                      
    Not                440           3760  4200
    Straight A's        60            240   300
    All                500           4000  4500
    

    Toy data

    df=pd.DataFrame({'c1':[1,2,2,3,4],'c2':[2,2,3,3,3],'c3':[1,2,3,4,5]}) 
    # before `agg`, I think your input is the result after `groupby` 
    df
    Out[37]: 
       c1  c2  c3
    0   1   2   1
    1   2   2   2
    2   2   3   3
    3   3   3   4
    4   4   3   5
    
    
    pd.crosstab(df.c1,df.c2,df.c3,aggfunc='sum',margins
    =True)
    Out[38]: 
    c2     2     3  All
    c1                 
    1    1.0   NaN    1
    2    2.0   3.0    5
    3    NaN   4.0    4
    4    NaN   5.0    5
    All  3.0  12.0   15
    
    0 讨论(0)
  • 2020-12-08 07:32

    append and assign

    The point of this answer is to provide an in line and not an in place solution.

    append

    I use append to stack a Series or DataFrame vertically. It also creates a copy so that I can continue to chain.

    assign

    I use assign to add a column. However, the DataFrame I'm working on is in the in between nether space. So I use a lambda in the assign argument which tells Pandas to apply it to the calling DataFrame.


    df.append(df.sum().rename('Total')).assign(Total=lambda d: d.sum(1))
    
                  Graduate  Undergraduate  Total
    Not                440           3760   4200
    Straight A's        60            240    300
    Total              500           4000   4500
    

    Fun alternative

    Uses drop with errors='ignore' to get rid of potentially pre-existing Total rows and columns.

    Also, still in line.

    def tc(d):
      return d.assign(Total=d.drop('Total', errors='ignore', axis=1).sum(1))
    
    df.pipe(tc).T.pipe(tc).T
    
                  Graduate  Undergraduate  Total
    Not                440           3760   4200
    Straight A's        60            240    300
    Total              500           4000   4500
    
    0 讨论(0)
  • 2020-12-08 07:40

    Or in two steps, using the .sum() function as you suggested (which might be a bit more readable as well):

    import pandas as pd
    
    df = pd.DataFrame( {"Undergraduate": {"Straight A's": 240, "Not": 3_760},"Graduate": {"Straight A's": 60, "Not": 440},})
    
    #Total sum per column: 
    df.loc['Total',:]= df.sum(axis=0)
    
    #Total sum per row: 
    df.loc[:,'Total'] = df.sum(axis=1)
    

    Output:

                  Graduate  Undergraduate  Total
    Not                440           3760   4200
    Straight A's        60            240    300
    Total              500           4000   4500
    
    0 讨论(0)
提交回复
热议问题