Label encoding across multiple columns in scikit-learn

礼貌的吻别 2020-11-22 09:02

I'm trying to use scikit-learn's LabelEncoder to encode a pandas DataFrame of string labels. As the dataframe has many (50+) columns, I want to a

22 answers
  •  青春惊慌失措
    2020-11-22 09:38

    With a single column, label encoding and its inverse transform are easy. Here is how to do it in Python when there are multiple columns:

    from sklearn import preprocessing

    def stringtocategory(dataset):
        '''
        @author puja.sharma
        @see Label-encodes the object-type columns and returns both the label-encoded data and its inverse transform
        @param dataset DataFrame whose columns are to be label encoded
        @return the label-encoded data and the inverse transform of the label-encoded data
        '''
        # Work on independent copies so the caller's DataFrame is not modified
        data_original = dataset.copy()
        data_transformed = dataset.copy()
        for y in dataset.columns:
            # Object dtype means the column holds strings or chars
            if dataset[y].dtype == object:
                print("The string type features are  : " + y)
                le = preprocessing.LabelEncoder()
                le.fit(dataset[y].unique())
                # Label-encoded data
                data_transformed[y] = le.transform(dataset[y])
                # Inverse label transform of the data
                data_original[y] = le.inverse_transform(data_transformed[y])
        return data_transformed, data_original
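
As a rough sketch of the same idea, the variant below keeps one fitted LabelEncoder per column in a dict, so the inverse transform can be applied later or to new data. The function names `encode_object_columns` and `decode_object_columns` and the sample DataFrame are made up for illustration and are not part of the answer above. It also assumes that checking with `pd.api.types.is_object_dtype` / `is_string_dtype` is an acceptable way to pick out the string columns.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def encode_object_columns(df):
    """Label-encode each string/object column; return the encoded copy and the encoders."""
    encoded = df.copy()
    encoders = {}  # column name -> fitted LabelEncoder (hypothetical bookkeeping)
    for col in df.columns:
        is_text = (pd.api.types.is_object_dtype(df[col])
                   or pd.api.types.is_string_dtype(df[col]))
        if is_text:
            le = LabelEncoder()
            encoded[col] = le.fit_transform(df[col])
            encoders[col] = le
    return encoded, encoders


def decode_object_columns(encoded, encoders):
    """Map the integer codes back to the original labels, column by column."""
    decoded = encoded.copy()
    for col, le in encoders.items():
        decoded[col] = le.inverse_transform(encoded[col])
    return decoded


# Illustrative data only
df = pd.DataFrame({
    "fruit": ["apple", "pear", "apple"],
    "color": ["red", "green", "green"],
    "weight": [1.0, 2.5, 1.2],
})

encoded, encoders = encode_object_columns(df)
restored = decode_object_columns(encoded, encoders)
```

LabelEncoder assigns codes in sorted order of the labels, so here "apple" maps to 0 and "pear" to 1; numeric columns such as `weight` are left untouched.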
    
