Calculating mean of a specific column by specific rows

南楼画角 submitted on 2021-02-05 06:55:11

Question


I have a dataframe that looks like the one in the pictures.

Now I want to add a new column showing the average power for each day (the data is sampled every 5 minutes), computed separately for daytime and nighttime according to the day_or_night column (day = 0, night = 1). I've gotten this far:

train['avg_by_day'][train['day_or_night']==1] = train['power'][train['day_or_night']==1].mean()
train['avg_by_day'][train['day_or_night']==0] = train['power'][train['day_or_night']==0].mean()

but this just assigns the overall average of all daytime power values (or, likewise, of all nighttime values), which isn't what I'm after: I want a separate average for each individual day and each individual night.

I need something like: train['avg_by_day'] = train.power.mean() where day == 1 and day_or_night == 1, repeated for every day.


Answer 1:


So you want to group the dataframe by day and day_or_night and create a new column with mean power values for each group:

train['avg_by_day'] = train.groupby(['day','day_or_night'])['power']\
                           .transform('mean')
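
To make the effect of transform('mean') concrete, here is a minimal sketch on made-up data. Only the column names day, day_or_night, and power come from the question; the values and the two-day frame are invented for illustration.

```python
import pandas as pd

# Toy data, invented for illustration: two days, each with
# daytime (0) and nighttime (1) samples.
train = pd.DataFrame({
    "day":          [1, 1, 1, 1, 2, 2, 2, 2],
    "day_or_night": [0, 0, 1, 1, 0, 0, 1, 1],
    "power":        [10.0, 20.0, 1.0, 3.0, 30.0, 50.0, 5.0, 7.0],
})

# transform('mean') returns a Series aligned to the original index,
# so every row receives the mean of its own (day, day_or_night) group.
train["avg_by_day"] = (
    train.groupby(["day", "day_or_night"])["power"].transform("mean")
)

print(train)
```

Unlike agg or a plain .mean(), which would return one row per group, transform keeps the original number of rows, which is why the result can be assigned straight back as a new column.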

You should probably also include year and month in the grouping columns; otherwise the 1st day of every month is grouped together, likewise the 2nd, and so on.
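
A possible sketch of that wider grouping follows. It assumes the frame has a datetime column, here called timestamp, from which year, month, and day can be derived; that column name and the sample values are hypothetical, since the question does not show the actual layout.

```python
import pandas as pd

# Hypothetical data: the 'timestamp' column and all values are assumptions
# made for this example, not taken from the question.
train = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-01-01 12:00", "2017-01-01 12:05",
        "2017-01-01 23:00", "2017-01-01 23:05",
        "2017-02-01 12:00", "2017-02-01 12:05",
    ]),
    "day_or_night": [0, 0, 1, 1, 0, 0],
    "power": [10.0, 20.0, 1.0, 3.0, 30.0, 50.0],
})

# Derive the calendar parts so that 1 January and 1 February
# end up in different groups.
train["year"] = train["timestamp"].dt.year
train["month"] = train["timestamp"].dt.month
train["day"] = train["timestamp"].dt.day

keys = ["year", "month", "day", "day_or_night"]
train["avg_by_day"] = train.groupby(keys)["power"].transform("mean")

print(train[["timestamp", "day_or_night", "power", "avg_by_day"]])
```

With only day and day_or_night as keys, the daytime rows of 1 January and 1 February in this toy frame would share a single mean; adding year and month keeps them apart.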



Source: https://stackoverflow.com/questions/43306199/calculating-mean-of-a-specific-column-by-specific-rows
