Filter rows based on one column's value and calculate percentage of sum in Pandas

Submitted by 泪湿孤枕 on 2021-01-28 18:56:06

Question


Given a small dataset as follows:

   value  input
0      3      0
1      4      1
2      3     -1
3      2      1
4      3     -1
5      5      0
6      1      0
7      1      1
8      1      1

I have used the following code:

df['pct'] = df['value'] / df['value'].sum()

But I want to calculate pct while excluding rows where input = -1. If input is -1, the corresponding value should not be included in the sum, and pct does not need to be calculated for that row (rows 2 and 4 in this case).

The expected result looks like this:

   value  input   pct
0      3      0  0.18
1      4      1  0.24
2      3     -1   NaN
3      2      1  0.12
4      3     -1   NaN
5      5      0  0.29
6      1      0  0.06
7      1      1  0.06
8      1      1  0.06

How could I do that in Pandas? Thanks.


Answer 1:


You can build a boolean mask for the rows to keep, replace the values of the excluded rows with missing values using Series.where so that sum ignores them, divide only the rows selected with DataFrame.loc, and finally round with Series.round:

mask = df['input'] != -1
df.loc[mask, 'pct'] = (df.loc[mask, 'value'] / df['value'].where(mask).sum()).round(2)

print (df)
   value  input   pct
0      3      0  0.18
1      4      1  0.24
2      3     -1   NaN
3      2      1  0.12
4      3     -1   NaN
5      5      0  0.29
6      1      0  0.06
7      1      1  0.06
8      1      1  0.06
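
For reference, the masked approach above can be reproduced end to end with the sketch below. The DataFrame is rebuilt from the sample data in the question, and the intermediate name total is only an illustrative assumption, not part of the original answer:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "value": [3, 4, 3, 2, 3, 5, 1, 1, 1],
    "input": [0, 1, -1, 1, -1, 0, 0, 1, 1],
})

# True for rows that should take part in the calculation.
mask = df["input"] != -1

# Sum only the kept rows (hypothetical helper variable).
total = df.loc[mask, "value"].sum()

# Assign pct only to the kept rows; the excluded rows are left as NaN.
df.loc[mask, "pct"] = (df.loc[mask, "value"] / total).round(2)

print(df)
```

Because the assignment goes through df.loc[mask, 'pct'], the new column is created with NaN in every row the mask leaves out, which matches the expected result in the question.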

EDIT: If you need 0 instead of missing values, pass 0 as the second argument of where. The resulting Series s can also be summed directly, and the total is the same as when the excluded rows are replaced with missing values:

s = df['value'].where(df['input'] != -1, 0)
df['pct'] = (s / s.sum()).round(2)

print (df)
   value  input   pct
0      3      0  0.18
1      4      1  0.24
2      3     -1  0.00
3      2      1  0.12
4      3     -1  0.00
5      5      0  0.29
6      1      0  0.06
7      1      1  0.06
8      1      1  0.06
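
The two variants give the same denominator because Series.sum skips missing values by default (skipna=True), so replacing the excluded rows with NaN or with 0 leads to the same total. A minimal sketch of that comparison, assuming the sample data from the question (the names with_nan, with_zero, pct_nan and pct_zero are illustrative only):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "value": [3, 4, 3, 2, 3, 5, 1, 1, 1],
    "input": [0, 1, -1, 1, -1, 0, 0, 1, 1],
})
mask = df["input"] != -1

with_nan = df["value"].where(mask)      # excluded rows become NaN
with_zero = df["value"].where(mask, 0)  # excluded rows become 0

# sum() ignores NaN by default, so both totals are equal.
print(with_nan.sum(), with_zero.sum())

pct_nan = (with_nan / with_nan.sum()).round(2)
pct_zero = (with_zero / with_zero.sum()).round(2)

# Side-by-side view: the columns differ only in the excluded rows.
print(pd.DataFrame({"pct_nan": pct_nan, "pct_zero": pct_zero}))
```

Which variant to use depends on whether downstream code should treat the excluded rows as "not applicable" (NaN) or as contributing nothing (0).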


Source: https://stackoverflow.com/questions/62989116/filter-rows-based-one-column-value-and-calculate-percentage-of-sum-in-pandas
