                country_name  country_code  val_code      y191      y192      y193      y194      y195
    United States of America           231         1  47052179  43361966  42736682  43196916  41751928
    United States of America           231         2   1187385   1201557   1172941   1176366   1192173
    United States of America           231         3  28211467  27668273  29742374  27543836  28104317
    United States of America           231         4    179000    193000    233338    276639    249688
    United States of America           231         5  12613922  12864425  13240395  14106139  15642337
In the data frame above, I would like to compute, for each row, the percentage of the overall total accounted for by that val_code, resulting in the following data frame.
That is, sum the year columns (y191 to y195) in each row and divide by the sum of those columns over all rows:
                country_name  country_code  val_code         perc
    United States of America           231         1  50.14947129
    United States of America           231         2  1.363631254
    United States of America           231         3  32.48344744
    United States of America           231         4  0.260213146
    United States of America           231         5  15.74323688
Right now I am doing this, but it is not working:
    grp_df = df.groupby(['country_name', 'val_code']).agg()
    pct_df = grp_df.groupby(level=0).apply(lambda x: 100*x/float(x.sum()))
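For reference, here is a minimal sketch of one way to get the `perc` column shown above. It assumes the year columns are exactly `y191` to `y195` and that the total should be taken per `country_name` (with a single country in the sample, that is the same as the total over all rows). The names `year_cols`, `row_total` and `country_total` are made up for illustration. The snippet in the question probably fails because `agg()` is called without an aggregation function, and because `float(x.sum())` is applied to a sum that has one value per column rather than a single number.

```python
import pandas as pd

# Sample data copied from the data frame in the question.
df = pd.DataFrame({
    "country_name": ["United States of America"] * 5,
    "country_code": [231] * 5,
    "val_code": [1, 2, 3, 4, 5],
    "y191": [47052179, 1187385, 28211467, 179000, 12613922],
    "y192": [43361966, 1201557, 27668273, 193000, 12864425],
    "y193": [42736682, 1172941, 29742374, 233338, 13240395],
    "y194": [43196916, 1176366, 27543836, 276639, 14106139],
    "y195": [41751928, 1192173, 28104317, 249688, 15642337],
})

# Assumption: these five columns are the only value columns.
year_cols = ["y191", "y192", "y193", "y194", "y195"]

# Sum across the year columns, giving one total per row (per val_code).
row_total = df[year_cols].sum(axis=1)

# Sum of the row totals within each country, broadcast back to every row.
country_total = row_total.groupby(df["country_name"]).transform("sum")

# Keep the identifying columns and attach the percentage.
pct_df = df[["country_name", "country_code", "val_code"]].copy()
pct_df["perc"] = 100 * row_total / country_total

print(pct_df)
```

Using `transform("sum")` keeps the result aligned with the original rows, so no merge or re-indexing is needed afterwards. If the percentage should be relative to the whole frame rather than per country, `row_total.sum()` can be used as the denominator instead.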