Having Trouble with multiple “groupby” with a variable and a category (binned data)

Submitted by 断了今生、忘了曾经 on 2019-12-23 20:28:41

Question


df.dtypes

Close       float64
eqId          int64
date         object
IntDate       int64
expiry        int64
delta         int64
ivMid       float64
conf        float64
Skew        float64
psc         float64
vol_B      category
dtype: object

gb = df.groupby([df['vol_B'],df['expiry']])

gb.describe()

I get a long error message whose final line is:

AttributeError: 'Categorical' object has no attribute 'flags'

When I group by each of them separately, each groupby works fine on its own. I just cannot group by both at once when one of the keys is a binned (categorical) column.

Also, when I use two other variables, the multi-key groupby works. For example, this ran successfully:

gb = df.groupby([df['delta'],df['expiry']])

Answer 1:


I ran into a similar issue to the OP's and found this question while looking for solutions. After going through the pandas documentation for categorical variables, a simple hack that worked for me was to change the type of the categorical variable before grouping.

Since vol_B is the categorical variable in your case, try the following:

# Depending on the contents of vol_B, astype(int) or astype(float) may work as well.
gb = df.groupby([df['vol_B'].astype(str), df['expiry']])

I haven't looked into why this works while grouping on the categorical column directly fails; if I do, I will update the answer.
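
For anyone who wants to try the cast on a small frame first, here is a minimal sketch. The sample values, the bin edges, and the "low"/"high" labels are made up for illustration; only the column names come from the question, and I am assuming vol_B was produced by something like pd.cut on ivMid.

```python
import pandas as pd

# Hypothetical sample data; only the column names mirror the question.
df = pd.DataFrame({
    "ivMid": [0.10, 0.15, 0.22, 0.31, 0.12, 0.27],
    "expiry": [30, 30, 60, 60, 30, 60],
    "Close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
})

# Assumed binning step: pd.cut returns a column with dtype "category".
df["vol_B"] = pd.cut(df["ivMid"], bins=[0.0, 0.2, 0.4], labels=["low", "high"])

# Workaround from the answer: cast the categorical key to str before grouping.
gb = df.groupby([df["vol_B"].astype(str), df["expiry"]])

# Aggregate one column per (vol_B, expiry) pair.
summary = gb["Close"].mean()
print(summary)
```

One thing to keep in mind: after the cast the group keys are plain strings, so any ordering the categories had is lost and the groups sort alphabetically.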



Source: https://stackoverflow.com/questions/30445044/having-trouble-with-multiple-groupby-with-a-variable-and-a-category-binned-da
