Compute row percentages in pandas DataFrame?

傲寒 2021-01-03 02:05

I have my data in a pandas DataFrame, and it looks like the following:

cat  val1   val2   val3   val4
A    7      10     0      19
B    10     2      1               


        
2 Answers
  • 2021-01-03 02:34

    You can do this using apply:

    df[['val1', 'val2', 'val3', 'val4']] = df[['val1', 'val2', 'val3', 'val4']].apply(lambda x: x/x.sum(), axis=1)
    
    
    >>> df
      cat      val1      val2      val3      val4
    0   A  0.194444  0.277778  0.000000  0.527778
    1   B  0.370370  0.074074  0.037037  0.518519
    2   C  0.119048  0.357143  0.142857  0.380952
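
For anyone who wants to run the `apply` approach end to end, here is a minimal sketch. The table in the question is cut off, so row B's `val4` and the whole of row C are assumptions, chosen so that the result matches the output printed above. The names `num_cols` and `result` are illustrative, and `select_dtypes` is used here only to avoid listing the value columns by hand.

```python
import pandas as pd

# Assumed sample data: the question's table is truncated, so B's val4 and
# every value in row C are guesses that reproduce the printed output.
df = pd.DataFrame({
    "cat": ["A", "B", "C"],
    "val1": [7, 10, 5],
    "val2": [10, 2, 15],
    "val3": [0, 1, 6],
    "val4": [19, 14, 16],
})

# Select the numeric columns instead of hard-coding their names.
num_cols = df.select_dtypes(include="number").columns

# Work on a copy so the original counts are not overwritten.
result = df.copy()

# axis=1 passes each row to the lambda, which divides it by its own total.
result[num_cols] = df[num_cols].apply(lambda row: row / row.sum(), axis=1)

print(result)
```

Writing into a copy keeps the raw counts available, which is handy if you need both the counts and the shares later.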
    
  • 2021-01-03 02:59

    div + sum

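    For a vectorised solution, divide each row by its row total: df.sum(axis=1) gives one total per row, and passing axis=0 to div aligns those totals with the row index. You can use set_index + reset_index to keep the identifier column out of the calculation.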

    df = df.set_index('cat')
    res = df.div(df.sum(axis=1), axis=0)
    
    print(res.reset_index())
    
      cat      val1      val2      val3      val4
    0   A  0.194444  0.277778  0.000000  0.527778
    1   B  0.370370  0.074074  0.037037  0.518519
    2   C  0.119048  0.357143  0.142857  0.380952
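
One edge case the `div` + `sum` approach leaves open is a row whose total is zero: the division then yields NaN (for 0/0) or inf. The sketch below shows one possible way to handle that and to scale the shares to percentages. The all-zero row `Z` is hypothetical data added for illustration, and reporting 0 for such rows is only one convention, not something the answers above prescribe.

```python
import pandas as pd

# Hypothetical data: row "Z" is all zeros to demonstrate the edge case.
df = pd.DataFrame(
    {"val1": [7, 0], "val2": [10, 0], "val3": [0, 0], "val4": [19, 0]},
    index=pd.Index(["A", "Z"], name="cat"),
)

totals = df.sum(axis=1)

# Mask zero totals as NaN so an empty row yields NaN rather than inf.
safe_totals = totals.where(totals != 0)

# Row-wise shares; axis=0 aligns the totals with the row index.
shares = df.div(safe_totals, axis=0)

# Scale to percentages and, as one possible convention, show 0 for empty rows.
pct = (shares * 100).fillna(0).round(2)

print(pct.reset_index())
```

If you would rather keep empty rows visibly undefined, drop the `fillna(0)` call and leave the NaN values in place.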
    