Pandas groupby/apply has different behaviour with int and string types

大憨熊 submitted on 2019-12-04 17:04:43

The problem is simply that a function applied to a GroupBy should never modify the dataframe it receives. Whether that dataframe is a copy (which can safely be modified, although the changes will not be seen in the original dataframe) or a view is implementation-dependent. pandas makes that choice internally, and as a user you should simply treat modifying the received dataframe as forbidden.

The correct way is to force a copy:

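def func2(x):
    # Work on a private copy so the frame handed over by pandas is never modified
    x = x.copy()
    # Every row of a group shares the same 'X', so the first row identifies the group
    if x.iloc[0]['X'] == 'A':
        x['D'] = 'u'
    else:
        x['D'] = 'v'
    return x[['X', 'D']]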

After that, df.groupby('X').apply(func2).reset_index(level=0, drop=True) gives the expected result:

   X  D
0  A  u
1  A  u
2  A  u
3  A  u
4  B  v
5  B  v
6  B  v
7  B  v
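
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample dataframe (columns X and Y) is an assumption, since the original data is not shown here. Two further choices are mine rather than part of the answer above: the grouping key is passed as a separate array (keys) so that the X column stays in the frame handed to func2 on pandas versions that exclude grouping columns from apply, and group_keys=False is used in place of the reset_index call so the original index is kept.

```python
import pandas as pd

# Hypothetical sample data; the original question's dataframe is not shown.
df = pd.DataFrame({"X": ["A"] * 4 + ["B"] * 4, "Y": range(8)})


def func2(x):
    # Copy first, so the frame received from pandas is never modified.
    x = x.copy()
    if x.iloc[0]["X"] == "A":
        x["D"] = "u"
    else:
        x["D"] = "v"
    return x[["X", "D"]]


# Group by the values of X as a plain array (an assumption made for
# portability across pandas versions), so column X remains in each group.
keys = df["X"].to_numpy()
result = df.groupby(keys, group_keys=False).apply(func2)

print(result)
print(list(df.columns))  # the original dataframe has no 'D' column
```

Because func2 only ever touches its own copy, df is left unchanged whether pandas passes a view or a copy to the function.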