I am trying to run some Machine learning algo on a dataset using scikit-learn. My dataset has some features which are like categories. Like one feature is A, wh
You should use the OneHotEncoder transformer with the categorical variables, and leave the ordinal variable untouched:
>>> import pandas as pd
>>> from sklearn.preprocessing import OneHotEncoder
>>> df = pd.DataFrame({'quality': [1, 2, 3], 'city': [3, 2, 1], columns=['quality', 'city']}
>>> enc = OneHotEncoder(categorical_features=[False, True])
>>> X = df.values
>>> enc.fit(X)
>>> enc.transform(X).todense()
matrix([[ 0., 0., 1., 1.],
[ 0., 1., 0., 2.],
[ 1., 0., 0., 3.]])