How to specify a variable in pandas as ordinal/categorical?

攒了一身酷 2020-12-24 15:36

I am trying to run a machine learning algorithm on a dataset using scikit-learn. My dataset has some features that are categorical. For example, one feature is A, wh

3 answers

心在旅途 (OP)

2020-12-24 16:09

You should use the OneHotEncoder transformer with the categorical variables, and leave the ordinal variable untouched:

>>> import pandas as pd
>>> from sklearn.preprocessing import OneHotEncoder
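>>> df = pd.DataFrame({'quality': [1, 2, 3], 'city': [3, 2, 1]}, columns=['quality', 'city'])
>>> # categorical_features was deprecated in scikit-learn 0.20 and removed in 0.22,
>>> # so this call only works on older releases
>>> enc = OneHotEncoder(categorical_features=[False, True])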
>>> X = df.values
>>> enc.fit(X)
>>> enc.transform(X).todense()
matrix([[ 0.,  0.,  1.,  1.],
        [ 0.,  1.,  0.,  2.],
        [ 1.,  0.,  0.,  3.]])
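
On recent scikit-learn releases the same idea is usually written with ColumnTransformer. The following is only a minimal sketch, assuming scikit-learn 0.20 or later; the step name 'city_onehot' is an arbitrary label picked for illustration, and sparse_threshold=0 is set here just to get a dense array back:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({'quality': [1, 2, 3], 'city': [3, 2, 1]})

# One-hot encode only the categorical 'city' column and pass the
# ordinal 'quality' column through unchanged.
ct = ColumnTransformer(
    transformers=[('city_onehot', OneHotEncoder(), ['city'])],  # hypothetical step name
    remainder='passthrough',
    sparse_threshold=0,  # always return a dense NumPy array
)

X = ct.fit_transform(df)
print(X)
```

The one-hot columns for city come first and the untouched quality column comes last, the same layout as the matrix shown above.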
