Pandas percentage of total with groupby

没有蜡笔的小新 2020-11-22 06:41

This is obviously simple, but as a numpy newbie I'm getting stuck.

I have a CSV file that contains 3 columns: the State, the Office ID, and the Sales for that office.

14 answers
  •  执念已碎
    2020-11-22 07:14

    I realize there are already good answers here.

    I nevertheless would like to contribute my own, because I feel for an elementary, simple question like this, there should be a short solution that is understandable at a glance.

    It should also work in a way that I can add the percentages as a new column, leaving the rest of the dataframe untouched. Last but not least, it should generalize in an obvious way to the case in which there is more than one grouping level (e.g., state and country instead of only state).

    The following snippet fulfills these criteria:

    df['sales_ratio'] = df.groupby(['state'])['sales'].transform(lambda x: x/x.sum())
    

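    Note that if you're still using Python 2 and the sales column holds integers, you may have to replace x.sum() in the denominator of the lambda with float(x.sum()) to avoid integer division. (Calling float(x) on the whole group would fail, since x is a Series rather than a scalar.)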
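
To make the one-liner concrete, here is a minimal sketch on made-up data. The column names `state` and `sales` follow the snippet above; the sample values and the `country`, `office_id`, and `sales_pct` columns are assumptions added only for illustration. It groups on two levels to show the generalization mentioned earlier, and also shows a variant that avoids the lambda by passing the string "sum" to transform.

```python
import pandas as pd

# Hypothetical sample data (not from the original question).
df = pd.DataFrame({
    "country": ["US", "US", "US", "US", "US"],
    "state": ["CA", "CA", "WA", "WA", "WA"],
    "office_id": [1, 2, 3, 4, 5],
    "sales": [100, 300, 50, 50, 100],
})

# Each office's share of its group's total sales, added as a new column.
# Grouping on two levels works the same way as grouping on one.
df["sales_ratio"] = (
    df.groupby(["country", "state"])["sales"].transform(lambda x: x / x.sum())
)

# Variant without a lambda: broadcast the group totals back to the rows,
# then divide. Multiplying by 100 gives a percentage instead of a ratio.
group_total = df.groupby(["country", "state"])["sales"].transform("sum")
df["sales_pct"] = 100 * df["sales"] / group_total

print(df)
```

Both columns leave the rest of the dataframe untouched, and the ratios within each group sum to 1. The "sum" variant is usually faster on large frames because it avoids calling a Python function once per group.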
