fit_transform() takes 2 positional arguments but 3 were given with LabelBinarizer

时光取名叫无心 2020-12-07 16:35

I am totally new to machine learning and have been working with unsupervised learning techniques.

The image shows my sample data (after all cleaning). Screenshot: Sample

13 answers
  •  爱一瞬间的悲伤
    2020-12-07 16:56

    To one-hot encode multiple categorical features, we can write a custom transformer class that binarizes several categorical features at once and plug it into the categorical pipeline as follows.

    Suppose CAT_FEATURES = ['cat_feature1', 'cat_feature2'] is a list of categorical features. The following script should resolve the issue and produce the desired output.

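    import pandas as pd
    from sklearn.pipeline import Pipeline
    from sklearn.base import BaseEstimator, TransformerMixin

    # Names of the categorical columns to encode.
    CAT_FEATURES = ['cat_feature1', 'cat_feature2']

    class CustomLabelBinarizer(BaseEstimator, TransformerMixin):
        """Perform one-hot encoding on categorical features."""
        def __init__(self, cat_features):
            self.cat_features = cat_features

        def fit(self, X_cat, y=None):
            # Nothing is learned here; the encoding happens in transform().
            return self

        def transform(self, X_cat):
            # Rebuild a DataFrame so pandas can one-hot encode each column.
            X_cat_df = pd.DataFrame(X_cat, columns=self.cat_features)
            X_onehot_df = pd.get_dummies(X_cat_df, columns=self.cat_features)
            return X_onehot_df.values

    # Pipeline for categorical features.
    # DataFrameSelector is not defined in this snippet; it is assumed to be a
    # user-defined transformer that selects the given columns from a DataFrame.
    cat_pipeline = Pipeline([
        ('selector', DataFrameSelector(CAT_FEATURES)),
        ('onehot_encoder', CustomLabelBinarizer(CAT_FEATURES))
    ])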
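To try the pipeline end to end, here is a minimal, self-contained sketch. It assumes DataFrameSelector is a small user-defined transformer that returns the selected columns as a NumPy array (the answer above does not define it, so this version is only a guess at its behavior). The toy DataFrame and the num_feature column are made up for illustration.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline

CAT_FEATURES = ["cat_feature1", "cat_feature2"]


class DataFrameSelector(BaseEstimator, TransformerMixin):
    """Assumed helper (not from the answer): pick columns from a DataFrame."""

    def __init__(self, attribute_names):
        self.attribute_names = attribute_names

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Return the selected columns as a plain NumPy array.
        return X[self.attribute_names].values


class CustomLabelBinarizer(BaseEstimator, TransformerMixin):
    """One-hot encode the given categorical features with pandas."""

    def __init__(self, cat_features):
        self.cat_features = cat_features

    def fit(self, X_cat, y=None):
        return self

    def transform(self, X_cat):
        X_cat_df = pd.DataFrame(X_cat, columns=self.cat_features)
        X_onehot_df = pd.get_dummies(X_cat_df, columns=self.cat_features)
        return X_onehot_df.values


cat_pipeline = Pipeline([
    ("selector", DataFrameSelector(CAT_FEATURES)),
    ("onehot_encoder", CustomLabelBinarizer(CAT_FEATURES)),
])

# Hypothetical toy data: two categorical columns and one numeric column.
df = pd.DataFrame({
    "cat_feature1": ["a", "b", "a"],
    "cat_feature2": ["x", "x", "y"],
    "num_feature": [1.0, 2.0, 3.0],
})

# Pipeline.fit_transform passes (X, y) to each step; the custom class accepts
# both, which is what LabelBinarizer.fit_transform(y) does not.
onehot = cat_pipeline.fit_transform(df)
print(onehot.shape)  # two categories per feature, so 3 rows and 4 columns
```

One caveat with this design: pd.get_dummies derives the output columns from whatever data reaches transform(), so a test set with missing or unseen categories can produce a different number of columns than the training set. scikit-learn's OneHotEncoder learns the categories in fit() and may be the safer choice when that matters.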
    
