Custom transformer for sklearn Pipeline that alters both X and y

忘掉有多难 2020-12-15 06:25

I want to create my own transformer for use with the sklearn Pipeline. Hence I am creating a class that implements both fit and transform methods. The purpose of the transfo

3 Answers
  •  伪装坚强ぢ
    2020-12-15 07:22

    You can solve this easily by using the sklearn.preprocessing.FunctionTransformer class (http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.FunctionTransformer.html).

    You just need to put your alterations to X in a function:

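    import pandas as pd
    from sklearn.preprocessing import FunctionTransformer

    thresh = 2  # example value (assumption): max NaNs allowed per row

    def drop_nans(X, y=None):
        total = X.shape[1]            # number of columns in X
        new_thresh = total - thresh   # min non-NaN values a row must have
        df = pd.DataFrame(X)
        df = df.dropna(thresh=new_thresh)  # drop rows with too many NaNs
        return df.values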
    

    Then you get your transformer by calling

    transformer = FunctionTransformer(drop_nans, validate=False)
    

    which you can use in the pipeline. Because drop_nans reads thresh from the enclosing scope, the threshold can be set outside the drop_nans function.
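
As an alternative to a global variable, the threshold could be passed in explicitly through FunctionTransformer's kw_args parameter. The following is only a minimal sketch: the thresh keyword on drop_nans, the value 1, and the sample array are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import FunctionTransformer


def drop_nans(X, thresh=1):
    """Drop rows of X that contain more than `thresh` NaN values."""
    df = pd.DataFrame(X)
    # dropna(thresh=k) keeps rows that have at least k non-NaN values
    min_non_nan = df.shape[1] - thresh
    return df.dropna(thresh=min_non_nan).values


# kw_args forwards the (assumed) threshold to drop_nans on every call
transformer = FunctionTransformer(drop_nans, validate=False, kw_args={"thresh": 1})

# Illustrative data: rows with 0, 1 and 2 NaNs respectively
X = np.array([
    [1.0, 2.0, 3.0],
    [np.nan, 5.0, 6.0],
    [np.nan, np.nan, 9.0],
])

X_clean = transformer.fit_transform(X)
print(X_clean.shape)
```

Keep in mind that a FunctionTransformer only returns a new X. The corresponding rows of y are not filtered, so the row counts of X and y can diverge if this step runs before a supervised estimator.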
