Question
I have a pandas DataFrame like:
    pet  treats  lbs
0   cat       2  5.0
1   dog       1  9.9
2  snek       3  1.1
3   cat       6  4.5
4   dog       1  9.4
I would like to add a fourth column that takes each treat as a percentage of the total treats for pets of that kind. So, the treat value in row 0, divided by the sum of all treats for pets matching "cat" (and so on for each row).
In Excel, I think I would do something like this:
   A     B  C    D
1  cat   2  5.0  =B1/SUMIF(A:A,A1,B:B)
2  dog   1  9.9  =B2/SUMIF(A:A,A2,B:B)
3  snek  3  1.1  =B3/SUMIF(A:A,A3,B:B)
4  cat   6  4.5  =B4/SUMIF(A:A,A4,B:B)
5  dog   1  9.4  =B5/SUMIF(A:A,A5,B:B)
Does anyone have an idea how I could add this "treat_percent" column using pandas?
    pet  treats  lbs  treat_percent
0   cat       2  5.0          25.00
1   dog       1  9.9          50.00
2  snek       3  1.1         100.00
3   cat       6  4.5          75.00
4   dog       1  9.4          50.00
So far, I have tried:
df['treat_percent'] = df['pet'] / df.groupby('pet')['treats'].sum()
and
df['treat_percent'] = df['pet'] / df.loc[df['pet'] == df['pet'], 'treats'].sum()
Answer 1:
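You can use transform, which broadcasts each group's sum back onto the original rows so the division lines up row by row. The result is a fraction; multiply by 100 if you want a percentage.
df['treat_rate'] = df['treats'] / df.groupby('pet')['treats'].transform('sum')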
df
Out[153]:
    pet  treats  lbs  treat_rate
0   cat       2  5.0        0.25
1   dog       1  9.9        0.50
2  snek       3  1.1        1.00
3   cat       6  4.5        0.75
4   dog       1  9.4        0.50
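For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same transform approach, assuming the sample data from the question. It scales the result to a percentage to match the requested "treat_percent" column. The names group_total, totals_by_pet and alt are illustrative only and do not come from the original posts. The mapped lookup at the end is an alternative added here for comparison, not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "pet": ["cat", "dog", "snek", "cat", "dog"],
    "treats": [2, 1, 3, 6, 1],
    "lbs": [5.0, 9.9, 1.1, 4.5, 9.4],
})

# transform('sum') returns one value per original row (the row's group total),
# so it shares df's index and can be used directly in element-wise division.
group_total = df.groupby("pet")["treats"].transform("sum")

# Each row's treats as a percentage of its group's total, like B1/SUMIF(A:A,A1,B:B).
df["treat_percent"] = (df["treats"] / group_total * 100).round(2)

# Alternative (illustrative): .sum() alone is indexed by pet name, not by row,
# so map each row's pet onto that Series to look up its group total.
totals_by_pet = df.groupby("pet")["treats"].sum()
alt = (df["treats"] / df["pet"].map(totals_by_pet) * 100).round(2)

print(df)
```

The first attempt in the question fails for two reasons: it divides the string column df['pet'] rather than df['treats'], and groupby(...).sum() is indexed by pet name, so it does not align with the row index of df. transform avoids the alignment problem by returning a Series with the same index as df.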
Source: https://stackoverflow.com/questions/50053723/pandas-adding-an-excel-sumif-column-like-a1-sumifbb-b1-aa