问题
I am trying to build a table that has groups that are divided by subgroups with count and average for each subgroup. For example, I want to convert the following data frame:
To a table that looks like this where the interval is a bigger group and columns a thru i become subgroups within the group with the corresponding subgroups' count and average in each cell:
I have tried this with no success:
回答1:
Use DataFrame.melt with GroupBy.agg and tuples for aggregate functions with new columns names:
df1 = (df.melt('interval', var_name='source')
.groupby(['interval','source'])['value']
.agg([('cnt','count'), ('average','mean')])
.reset_index())
print (df1.head())
interval source cnt average
0 0 a 1 5.0
1 0 b 1 0.0
2 0 c 1 0.0
3 0 d 1 0.0
4 0 f 1 0.0
回答2:
Try.
df.groupby(['interval']).apply(lambda x : x.stack()
.groupby(level=-1)
.agg({'count', 'mean'}))
Use groupby with apply to apply a function for each group then stack and groupby again with agg to find count and mean.
回答3:
The following code solves the problem I asked for:
df.group(['interval'],,as_index=False).agg({
'a':{"count":"mean"},
'b':{"count":"mean"},
'c':{"count":"mean"},
'd':{"count":"mean"},
'f':{"count":"mean"},
'g':{"count":"mean"},
'i':{"count":"mean"}
})
来源:https://stackoverflow.com/questions/55663359/python-summarizing-aggregating-groups-and-sub-groups-in-dataframe