I\'m trying to use scikit-learn\'s LabelEncoder to encode a pandas DataFrame of string labels. As the dataframe has many (50+) columns, I want to a
if we have single column to do the label encoding and its inverse transform its easy how to do it when there are multiple columns in python
def stringtocategory(dataset):
'''
@author puja.sharma
@see The function label encodes the object type columns and gives label encoded and inverse tranform of the label encoded data
@param dataset dataframe on whoes column the label encoding has to be done
@return label encoded and inverse tranform of the label encoded data.
'''
data_original = dataset[:]
data_tranformed = dataset[:]
for y in dataset.columns:
#check the dtype of the column object type contains strings or chars
if (dataset[y].dtype == object):
print("The string type features are : " + y)
le = preprocessing.LabelEncoder()
le.fit(dataset[y].unique())
#label encoded data
data_tranformed[y] = le.transform(dataset[y])
#inverse label transform data
data_original[y] = le.inverse_transform(data_tranformed[y])
return data_tranformed,data_original