I\'m trying to write a function to aggregate and perform various stats calcuations on a dataframe in Pandas and then merge it to the original dataframe however, I\'m running
By default, groupby output has the grouping columns as indicies, not columns, which is why the merge is failing.
There are a couple different ways to handle it, probably the easiest is using the as_index parameter when you define the groupby object.
po_grouped_df = poagg_df.groupby(['EID','PCODE'], as_index=False)
Then, your merge should work as expected.
In [356]: pd.merge(acc_df, pol_df, on=['EID','PCODE'], how='inner',suffixes=('_Acc','_Po'))
Out[356]:
EID PCODE SC_Acc EE_Acc SI_Acc PVALUE_Acc EE_Po PVALUE_Po \
0 123 GR 236 40000 1.805222e+31 350 10000 50
1 123 GR 236 40000 1.805222e+31 350 30000 300
2 123 GU 443 12000 8.765549e+87 250 10000 100
3 123 GU 443 12000 8.765549e+87 250 2000 150
SC_Po SI_Po
0 23 40
1 213 140
2 230 400
3 213 140