I have my data in a pandas DataFrame, and it looks like the following:
cat val1 val2 val3 val4
A 7 10 0 19
B 10 2 1
You can do this using apply
:
df[['val1', 'val2', 'val3', 'val4']] = df[['val1', 'val2', 'val3', 'val4']].apply(lambda x: x/x.sum(), axis=1)
>>> df
cat val1 val2 val3 val4
0 A 0.194444 0.277778 0.000000 0.527778
1 B 0.370370 0.074074 0.037037 0.518519
2 C 0.119048 0.357143 0.142857 0.380952
For a vectorised solution, divide the dataframe along axis=0
by its sum over axis=1
. You can use set_index
+ reset_index
to ignore the identifier column.
df = df.set_index('cat')
res = df.div(df.sum(axis=1), axis=0)
print(res.reset_index())
cat val1 val2 val3 val4
0 A 0.194444 0.277778 0.000000 0.527778
1 B 0.370370 0.074074 0.037037 0.518519
2 C 0.119048 0.357143 0.142857 0.380952