Using scikit to determine contributions of each feature to a specific class prediction

粉色の甜心 2020-12-02 13:57

I am using a scikit extra trees classifier:

from sklearn.ensemble import ExtraTreesClassifier
model = ExtraTreesClassifier(n_estimators=10000, n_jobs=-1, random_state=0)

Once the model is f

5 Answers
  •  感动是毒
    2020-12-02 14:15

    The paper "Why Should I Trust You?": Explaining the Predictions of Any Classifier was submitted 9 days after this question and provides a general algorithm for this problem! :-)

    In short, it is called LIME for "local interpretable model-agnostic explanations", and works by fitting a simpler, local model around the prediction(s) you want to understand.
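To make the "local model around the prediction" idea concrete, here is a minimal from-scratch sketch. It is not the lime package and not the paper's exact algorithm. The helper name explain_locally, the Gaussian perturbation, the exponential kernel and its width, the Ridge surrogate, and the iris data are all assumptions chosen for illustration. It also uses fewer trees than the question, for speed.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import Ridge

# Fit a small black-box model (illustrative data, fewer trees than the question).
data = load_iris()
X, y = data.data, data.target
model = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)


def explain_locally(model, X_train, x, n_samples=2000, seed=0):
    """Hypothetical helper: return (predicted class, local linear weights)."""
    rng = np.random.default_rng(seed)
    scale = X_train.std(axis=0)
    # 1. Perturb the instance with Gaussian noise scaled per feature.
    Z = x + rng.normal(size=(n_samples, x.shape[0])) * scale
    # 2. Query the black box for the probability of the class predicted for x.
    cls_idx = int(np.argmax(model.predict_proba(x.reshape(1, -1))[0]))
    target = model.predict_proba(Z)[:, cls_idx]
    # 3. Weight each perturbed point by its proximity to x (assumed kernel).
    dist = np.sqrt((((Z - x) / scale) ** 2).sum(axis=1))
    kernel_width = 0.75 * np.sqrt(x.shape[0])
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 4. Fit a weighted linear surrogate on standardized offsets from x;
    #    its coefficients approximate each feature's local contribution.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit((Z - x) / scale, target, sample_weight=weights)
    return model.classes_[cls_idx], surrogate.coef_


predicted, contributions = explain_locally(model, X, X[0])
print("predicted class:", data.target_names[predicted])
ranked = sorted(zip(data.feature_names, contributions), key=lambda t: -abs(t[1]))
for name, weight in ranked:
    print(f"{name}: {weight:+.3f}")
```

A positive weight means that increasing the feature near this instance raises the predicted probability of that class, and a negative weight lowers it. The weights hold only in the neighbourhood of the explained instance, so they are not global feature importances.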

    What's more, they have published a Python implementation (https://github.com/marcotcr/lime) with fairly detailed examples of how to use it with sklearn. For instance, one example covers a two-class random forest problem on text data, and another covers continuous and categorical features. All of them can be found via the README on GitHub.

    The authors had a very productive year in 2016 concerning this field, so if you like reading papers, here's a starter:

    • Programs as Black-Box Explanations
    • Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance
    • Model-Agnostic Interpretability of Machine Learning
