How to make a pandas crosstab with percentages?

时光说笑 2021-01-30 01:44

Given a dataframe with different categorical variables, how do I return a cross-tabulation with percentages instead of frequencies?

df = pd.DataFrame({'A' : [


        
6 Answers
  •  深忆病人
    2021-01-30 02:22

    Another option is to use div rather than apply:

    In [11]: res = pd.crosstab(df.A, df.B)
    

    Divide by the row totals, i.e. the sum across the columns for each index value:

    In [12]: res.sum(axis=1)
    Out[12]: 
    A
    one      12
    three     6
    two       6
    dtype: int64
    

    Similar to above, you need to do something about integer division (I use astype('float')):

    In [13]: res.astype('float').div(res.sum(axis=1), axis=0)
    Out[13]: 
    B             A         B         C
    A                                  
    one    0.333333  0.333333  0.333333
    three  0.333333  0.333333  0.333333
    two    0.333333  0.333333  0.333333
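
A self-contained sketch of the same approach, for anyone who wants to run it end to end. The sample frame here is hypothetical (the question's original data is cut off), chosen only so that the row totals match the ones shown above. It also compares the `div` result with the `normalize="index"` argument of `pd.crosstab`, which should give the same row fractions in reasonably recent pandas versions:

```python
import pandas as pd

# Hypothetical sample data: the frame in the question is truncated,
# so these values are an assumption made for illustration only.
df = pd.DataFrame({
    "A": ["one", "one", "two", "three"] * 6,
    "B": ["A", "B", "C"] * 8,
})

# Plain frequency table.
counts = pd.crosstab(df["A"], df["B"])

# Row fractions: divide every row by its own total.
# axis=0 aligns the row totals with the index of `counts`.
row_frac = counts.div(counts.sum(axis=1), axis=0)

# The same table, letting crosstab do the normalisation.
row_frac_norm = pd.crosstab(df["A"], df["B"], normalize="index")

# Scale to 0-100 if percentages rather than fractions are wanted.
row_pct = (row_frac * 100).round(1)
print(row_pct)
```

If column percentages or shares of the grand total are needed instead, `normalize="columns"` and `normalize="all"` are the corresponding options.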
    
