pandas create boolean column using groupby transform

Submitted by 感情迁移 on 2019-12-08 02:41:52

Question


I am trying to create a boolean column using GroupBy.transform on a df like this:

id    type
1     1.00000
1     1.00000
2     2.00000
2     3.00000
3     2.00000

The code is:

df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2)

But instead of boolean values, has_two contains float values, e.g. 0.0. Why is that?

UPDATE

I created a test case:

import pandas as pd

df = pd.DataFrame({'id': ['1', '1', '2', '2', '3'], 'type': [1.0, 1.0, 2.0, 1.0, 2.0]})
df['has_2'] = df.groupby('id')['type'].transform(lambda x: x == 2)

This gave me:

   id  type  has_2
0  1   1.0    0.0
1  1   1.0    0.0
2  2   2.0    1.0
3  2   1.0    0.0
4  3   2.0    1.0

If I use df['has_2'] = df['type'] == 2, as suggested by jezrael, the result is boolean as expected:

   id  type  has_2
0  1   1.0  False
1  1   1.0  False
2  2   2.0   True
3  2   1.0  False
4  3   2.0   True

I am using pandas==0.20.3 on Python 3.5.2. What is going on? Do I need to update pandas or Python 3?

UPDATE

Updating pandas to 0.22.0 fixed this issue.
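For anyone who cannot upgrade right away, here is a minimal sketch for checking which dtype the installed pandas produces and for forcing a boolean column either way. The data is the test case above; the explicit astype(bool) cast is an addition that does not appear in the original post, and it assumes the column has no missing values.

```python
import pandas as pd

# Test case from the question.
df = pd.DataFrame({'id': ['1', '1', '2', '2', '3'],
                   'type': [1.0, 1.0, 2.0, 1.0, 2.0]})

result = df.groupby('id')['type'].transform(lambda x: x == 2)

# Inspect what this pandas version actually returned.
print(pd.__version__, result.dtype)

# Defensive cast (an addition, not from the post): 0.0/1.0 become
# False/True, and an already-boolean result is left unchanged.
df['has_2'] = result.astype(bool)
print(df)
```

If the printed dtype is already bool, the cast is a no-op and can be dropped.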


Answer 1:


It works fine for me; I get a boolean column:

df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2)
print(df)
   id  type  has_two
0   1   1.0    False
1   1   1.0    False
2   2   2.0     True
3   2   3.0    False
4   3   2.0     True

But since the comparison is element-wise, it may be enough to compare the column directly, without groupby:

df['has_two'] = df['type'] == 2
print(df)
   id  type  has_two
0   1   1.0    False
1   1   1.0    False
2   2   2.0     True
3   2   3.0    False
4   3   2.0     True
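The two approaches in this answer can be compared side by side. The sketch below uses the sample data from the question; the variable names direct and grouped are only illustrative, and the astype(bool) cast is an added safeguard so the comparison does not depend on the dtype a particular pandas version returns from transform.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'id': [1, 1, 2, 2, 3],
                   'type': [1.0, 1.0, 2.0, 3.0, 2.0]})

# Direct, vectorized comparison: no grouping involved.
direct = df['type'] == 2

# The same comparison applied group by group, then cast to bool
# (the cast is an addition, not part of the original answer).
grouped = df.groupby('id')['type'].transform(lambda x: x == 2).astype(bool)

# Both Series should hold the same row-by-row values.
print(direct.tolist() == grouped.tolist())
```

Because x == 2 looks at each row independently, splitting the rows into groups first does not change the outcome, which is why the plain column comparison is the simpler choice here.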



Answer 2:


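Use this line, which casts the result of transform to bool (comparing the result with 2 a second time would give False for every row):

df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2).astype(bool)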

Worked for me :)



Source: https://stackoverflow.com/questions/48059985/pandas-create-boolean-column-using-groupby-transform
