Question
I've asked this question before, but the answer I got didn't work out quite as I thought it had, so here I am again.
Previous question: Defining a function for changing column values and creating new datasets
I am trying to define a function where it will take a dataframe and change values in a column to create multiple new dataframes.
As an example, from df1 looking like:
df1:
class colB colC
0 1 1b 1c
1 2 2b 2c
2 3 3b 3c
3 1 4b 4c
4 2 5b 5c
I am trying to create multiple binary classes to implement one-vs-all classification. So the function would create...
df2:
class colB colC
0 1 1b 1c
1 -1 2b 2c
2 -1 3b 3c
3 1 4b 4c
4 -1 5b 5c
df3:
class colB colC
0 -1 1b 1c
1 1 2b 2c
2 -1 3b 3c
3 -1 4b 4c
4 1 5b 5c
df4:
class colB colC
0 -1 1b 1c
1 -1 2b 2c
2 1 3b 3c
3 -1 4b 4c
4 -1 5b 5c
and so on. The unique class values are consecutive integers ranging from 1 to 120.
The problem with the previous answer (np.identity) was that it created dataframes that treated every single row as its own class, marked 1 or -1, instead of giving rows with identical values the same class.
Thanks
Answer 1:
A similar idea using np.where and unique (again renaming your class column to class_, because class is a reserved word in Python and cannot be used as a keyword argument to assign):
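import numpy as np
# Assumes 'class' was already renamed, e.g. df1 = df1.rename(columns={'class': 'class_'})
# One frame per unique label: 1 where the row matches that label, -1 otherwise.
dfs = [
    df1.assign(class_=np.where(df1['class_'].eq(i), 1, -1))
    for i in df1['class_'].unique()
]
for d in dfs:
    print(d, end='\n\n')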
class_ colB colC
0 1 1b 1c
1 -1 2b 2c
2 -1 3b 3c
3 1 4b 4c
4 -1 5b 5c
class_ colB colC
0 -1 1b 1c
1 1 2b 2c
2 -1 3b 3c
3 -1 4b 4c
4 1 5b 5c
class_ colB colC
0 -1 1b 1c
1 -1 2b 2c
2 1 3b 3c
3 -1 4b 4c
4 -1 5b 5c
Answer 2:
In a similar vein to @user3483203, but using mask and fillna:
[df.assign(**{'class' : df['class'].mask(df['class'].ne(cls)).fillna(-1)})
for cls in df['class'].unique()
]
[ class colB colC
0 1.0 1b 1c
1 -1.0 2b 2c
2 -1.0 3b 3c
3 1.0 4b 4c
4 -1.0 5b 5c, class colB colC
0 -1.0 1b 1c
1 2.0 2b 2c
2 -1.0 3b 3c
3 -1.0 4b 4c
4 2.0 5b 5c, class colB colC
0 -1.0 1b 1c
1 -1.0 2b 2c
2 3.0 3b 3c
3 -1.0 4b 4c
4 -1.0 5b 5c]
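Note that the output above keeps the original label (1.0, 2.0, 3.0) on the matching rows and upcasts the column to float, whereas the question asks for 1 and -1. One possible adjustment that stays with mask is sketched below; it is a variation on the answer rather than the answerer's own code, and the sample frame simply rebuilds the data from the question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "class": [1, 2, 3, 1, 2],
    "colB": ["1b", "2b", "3b", "4b", "5b"],
    "colC": ["1c", "2c", "3c", "4c", "5c"],
})

dfs = []
for cls in df["class"].unique():
    is_cls = df["class"].eq(cls)
    # Replace matching rows with 1, then all remaining rows with -1.
    recoded = df["class"].mask(is_cls, 1).mask(~is_cls, -1)
    dfs.append(df.assign(**{"class": recoded}))

print(dfs[1])  # frame for class 2
```

Because both replacements use scalar integers and no NaN is introduced, the column should stay an integer dtype instead of becoming float.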
Source: https://stackoverflow.com/questions/51914247/changing-multiple-column-values-to-binary-values