Best way to combine probabilistic classifiers in scikit-learn

误落风尘 2021-01-30 11:46

I have a logistic regression and a random forest, and I'd like to combine them (as an ensemble) by averaging their predicted class probabilities for the final classification.

Is t

4 Answers
  •  南方客 (original poster)
    2021-01-30 12:01

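    scikit-learn now has StackingClassifier (added in version 0.22), which can be used to stack multiple estimators. Note that stacking is not a plain average: a final estimator is trained on the base estimators' predictions and learns how to combine them.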

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier, StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)

    # Base estimators whose predictions feed the final estimator
    estimators = [
        ('rf', RandomForestClassifier(n_estimators=10, random_state=42)),
        ('lg', LogisticRegression()),
    ]
    # The final estimator learns how to combine the base predictions
    clf = StackingClassifier(
        estimators=estimators, final_estimator=LogisticRegression()
    )

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, stratify=y, random_state=42
    )
    clf.fit(X_train, y_train)
    clf.predict_proba(X_test)  # class probabilities for the test set
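
    If a plain average of the predicted probabilities is what you want, as the question asks, soft voting may be a closer fit than stacking. The following is a minimal sketch, not part of the original answer: it assumes the same iris split as above, uses VotingClassifier with voting="soft", and sets max_iter=1000 on the logistic regression only to avoid a convergence warning. The names voter, avg_proba and manual_proba are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)

# Soft voting averages the class probabilities of the fitted base models.
voter = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=10, random_state=42)),
        ("lg", LogisticRegression(max_iter=1000)),  # max_iter is an assumption
    ],
    voting="soft",
)
voter.fit(X_train, y_train)
avg_proba = voter.predict_proba(X_test)

# The same average computed by hand from the fitted base estimators.
manual_proba = np.mean(
    [est.predict_proba(X_test) for est in voter.estimators_], axis=0
)
```

    Here avg_proba and manual_proba should agree, so either route works. VotingClassifier also accepts a weights argument if the two models should not contribute equally.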
    
