I am trying to create a boolean column using GroupBy.transform on a df like this,
id type
1 1.00000
1 1.00000
2 2.00000
2 3.00000
3 2.00000
the code is like,
df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2)
but instead of boolean values, has_two contains float values, e.g. 0.0. Why is that?
UPDATE
I created a test case,
df = pd.DataFrame({'id':['1', '1', '2', '2', '3'], 'type':[1.0, 1.0, 2.0, 1.0, 2.0]})
df['has_2'] = df.groupby('id')['type'].transform(lambda x: x == 2)
this gave me,
id type has_2
0 1 1.0 0.0
1 1 1.0 0.0
2 2 2.0 1.0
3 2 1.0 0.0
4 3 2.0 1.0
If I use df['has_2'] = df['type'] == 2 instead, as suggested by jezrael, it works fine:
id type has_2
0 1 1.0 False
1 1 1.0 False
2 2 2.0 True
3 2 1.0 False
4 3 2.0 True
I am using pandas==0.20.3 on Python 3.5.2. What is going on here? Do I need to update pandas or Python 3?
UPDATE
Updating pandas to 0.22.0 fixed this issue.
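Since the behaviour differed between the two versions mentioned above, a quick way to see what a given installation does is to print the pandas version next to the dtype of the transform result. This is only a diagnostic sketch built on the test case from the question; the variable name result is introduced here for illustration.

```python
import pandas as pd

# Test case from the question.
df = pd.DataFrame({'id': ['1', '1', '2', '2', '3'],
                   'type': [1.0, 1.0, 2.0, 1.0, 2.0]})

# Keep the transform result in a separate variable so its dtype can be inspected.
result = df.groupby('id')['type'].transform(lambda x: x == 2)

print(pd.__version__)  # installed pandas version
print(result.dtype)    # reported as float on 0.20.3 and bool on 0.22.0 in this thread
```

If the dtype printed is not bool, the installation shows the behaviour described in the question.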
It works fine for me; I get a boolean column:
df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2)
print (df)
id type has_two
0 1 1.0 False
1 1 1.0 False
2 2 2.0 True
3 2 3.0 False
4 3 2.0 True
But it may be enough to compare the column directly, without groupby:
df['has_two'] = df['type'] == 2
print (df)
id type has_two
0 1 1.0 False
1 1 1.0 False
2 2 2.0 True
3 2 3.0 False
4 3 2.0 True
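Use this line, which casts the transform result to boolean explicitly (comparing the 0.0/1.0 result with == 2 again would give False everywhere):
df['has_two'] = df.groupby('id')['type'].transform(lambda x: x == 2).astype(bool)
Worked for me :)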
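To check that the plain comparison and the transform-based version agree, here is a minimal sketch using the test frame from the question. The explicit astype(bool) cast is an assumption added as a guard for versions that return 0.0/1.0; it should be a no-op where transform already returns booleans. The names direct and via_transform are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'id': ['1', '1', '2', '2', '3'],
                   'type': [1.0, 1.0, 2.0, 1.0, 2.0]})

# Plain element-wise comparison: no groupby involved, result dtype is bool.
direct = df['type'] == 2

# Same comparison through GroupBy.transform; the cast turns a 0.0/1.0
# result into False/True and leaves an already-boolean result unchanged.
via_transform = (
    df.groupby('id')['type']
      .transform(lambda x: x == 2)
      .astype(bool)
)

df['has_2'] = via_transform
print(df)
print(direct.equals(via_transform))
```

Because the comparison is element-wise and does not depend on the group, the direct form is the simpler choice here; the groupby version only pays off when the lambda actually uses per-group information.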
Source: https://stackoverflow.com/questions/48059985/pandas-create-boolean-column-using-groupby-transform