Question
import numpy as np
import pandas as pd
df = pd.DataFrame(["c", "b", "a p", np.nan, "ap"])
df[0].str.get_dummies(' ')
The code above prints something like this:
   a  p  b  c  ap
0  0  0  0  1   0
1  0  0  1  0   0
2  1  1  0  0   0
3  0  0  0  0   0
4  0  0  0  0   1
The required output is the following:
   a  p  b  c
0  0  0  0  1
1  0  0  1  0
2  1  1  0  0
3  0  0  0  0
4  1  1  0  0
I'm sure it's a bit tricky. Any help is appreciated.
Answer 1:
You can use str.get_dummies:
df[0].str.get_dummies(' ')
   air  bus  car  plane
0    0    0    1      0
1    0    1    0      0
2    1    0    0      1
Answer 2:
If I understand correctly, you want str.get_dummies:
df[0].str.get_dummies(sep=' ')
Out[745]:
   air  bus  car  plane
0    0    0    1      0
1    0    1    0      0
2    1    0    0      1
Or
pd.get_dummies(pd.DataFrame(df[0].str.split().tolist()).stack()).groupby(level=0).sum()
Out[754]:
   air  bus  car  plane
0    0    0    1      0
1    0    1    0      0
2    1    0    0      1
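Note that splitting on ' ' still leaves "ap" as a single token, so the character-level output requested in the question needs one more step. Here is a minimal sketch of one way to get it, assuming every non-space character should become its own column and missing values should yield an all-zero row. The helper name char_dummies is hypothetical, not a pandas function, and this is not taken from either answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(["c", "b", "a p", np.nan, "ap"])


def char_dummies(s):
    """Hypothetical helper: one indicator column per non-space character."""
    # Collect the characters of each row; non-string values (e.g. NaN) give an empty set.
    rows = [set(v.replace(" ", "")) if isinstance(v, str) else set() for v in s]
    # Every character seen anywhere becomes a column, in sorted order.
    columns = sorted(set().union(*rows))
    data = [[int(c in row) for c in columns] for row in rows]
    return pd.DataFrame(data, index=s.index, columns=columns)


# Reorder the columns only to match the layout shown in the question.
result = char_dummies(df[0])[["a", "p", "b", "c"]]
print(result)
```

With the question's data, "a p" and "ap" both end up with a 1 in columns a and p, and the NaN row stays all zeros.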
Source: https://stackoverflow.com/questions/49420338/pandas-get-dummies-to-create-one-hot-with-separator-and-with-character-lev