Sometimes it's convenient to keep an aggdict describing how each column should be transformed under aggregation, one that works with different column sets and different group-by columns. You can do this fairly easily with the new named-aggregation syntax by unpacking the dict with **. Here's a minimal working example on simple data.
import numpy as np
import pandas as pd
dfx = pd.DataFrame(columns=["A","B","C"], data=np.random.randint(0,5,size=(10,3)))
#dfx
#
# A B C
#0 4 4 1
#1 2 4 4
#2 1 3 3
#3 2 4 3
#4 1 2 1
#5 0 4 2
#6 2 3 4
#7 1 0 2
#8 2 1 4
#9 3 0 3
Maybe when you aggregate you want the first "A", the last "B", and the mean of "C", and sometimes your pipeline has a "D" (but not this time) that you also want the mean of.
aggdict = {"A": lambda x: x.iloc[0], "B": lambda x: x.iloc[-1], "C": "mean", "D": "mean"}
You can build a simple dict like in the old days and then unpack it with **, filtering on the relevant keys:
gb_col="C"
gbc = dfx.groupby(gb_col).agg(**{k:(k,v) for k,v in aggdict.items() if k in dfx.columns and k != gb_col})
# A B
#C
#1 4 2
#2 0 0
#3 1 4
#4 2 3
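To make it clearer what the ** unpacking does, here is a small sketch that compares it with writing the named-aggregation keywords out by hand. The fixed toy frame df and the names named, unpacked and explicit are my own assumptions for illustration (fixed values instead of random ones so the result is reproducible); they are not part of the example above.

```python
import pandas as pd

# Fixed toy data (an assumption for reproducibility; not the random frame above).
df = pd.DataFrame({"A": [4, 2, 1, 2], "B": [4, 4, 3, 4], "C": [1, 2, 1, 2]})

aggdict = {"A": lambda x: x.iloc[0], "B": lambda x: x.iloc[-1], "C": "mean", "D": "mean"}
gb_col = "C"

# Build {output_name: (input_column, aggfunc)}, skipping columns that are
# missing from the frame ("D") and the group-by column itself ("C").
named = {k: (k, v) for k, v in aggdict.items() if k in df.columns and k != gb_col}

# Unpacking the dict with ** ...
unpacked = df.groupby(gb_col).agg(**named)

# ... is the same as spelling out the keyword arguments yourself.
explicit = df.groupby(gb_col).agg(A=("A", aggdict["A"]), B=("B", aggdict["B"]))

print(unpacked.equals(explicit))
```

So the filter in the comprehension is what lets the same aggdict survive a missing "D" and a changing group-by column.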
And then you can slice and dice how you want with the same syntax:
def mygb(gb_col):
    return dfx.groupby(gb_col).agg(**{k:(k,v) for k,v in aggdict.items() if k in dfx.columns and k != gb_col})
allgb = [mygb(c) for c in dfx.columns]
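If you want the same thing to work across different frames (for example, a run of the pipeline where "D" does show up), one option is to pass the frame in as well. This is only a sketch: agg_with_dict, df_abc and df_abcd are hypothetical names I made up for illustration, and the data is fixed toy data.

```python
import pandas as pd

def agg_with_dict(df, gb_col, aggdict):
    """Hypothetical helper: apply only the aggdict entries that exist in df,
    leaving out the group-by column itself."""
    spec = {k: (k, v) for k, v in aggdict.items() if k in df.columns and k != gb_col}
    return df.groupby(gb_col).agg(**spec)

aggdict = {"A": lambda x: x.iloc[0], "B": lambda x: x.iloc[-1], "C": "mean", "D": "mean"}

# One frame without "D" and one with it (toy data, assumed for illustration).
df_abc = pd.DataFrame({"A": [1, 1, 2], "B": [5, 6, 7], "C": [1.0, 3.0, 5.0]})
df_abcd = df_abc.assign(D=[10.0, 20.0, 30.0])

out_abc = agg_with_dict(df_abc, "A", aggdict)    # aggregates B and C
out_abcd = agg_with_dict(df_abcd, "A", aggdict)  # aggregates B, C and D

print(out_abc)
print(out_abcd)
```

The same aggdict is reused unchanged; only the columns that are present in each frame get aggregated.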