How to get the most contributing feature in any sklearn classifier, for example DecisionTreeClassifier, KNN, etc.

Submitted by 江枫思渺然 on 2019-12-09 07:05:06

Question


I have tried my model on a data set using a KNN classifier. I would like to know which feature contributes most to the model, and which feature contributes most to a prediction.


Answer 1:


To gain qualitative insight into which feature has the greatest impact on classification, you could perform n_feats classifications using a single feature at a time (n_feats stands for the feature vector dimension), like this:

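import numpy as np
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Iris: 150 samples, 4 features, 3 classes
iris = datasets.load_iris()

# KNN with default settings (5 neighbors)
clf = KNeighborsClassifier()

y = iris.target
n_feats = iris.data.shape[1]

print('Feature  Accuracy')
for i in range(n_feats):
    # Keep only column i, reshaped to (n_samples, 1) as scikit-learn expects 2-D input
    X = iris.data[:, i].reshape(-1, 1)
    # Mean cross-validated accuracy using feature i alone
    scores = cross_val_score(clf, X, y)
    print('%d        %g' % (i, scores.mean()))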

Output:

Feature  Accuracy
0        0.692402
1        0.518382
2        0.95384
3        0.95384

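These results suggest that classification is dominated by features 2 and 3 (petal length and petal width in the iris data set). The exact numbers depend on the scikit-learn version, because the default number of folds in cross_val_score changed from 3 to 5 in version 0.22.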
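
The same single-feature loop can be wrapped in a helper so that it runs against any scikit-learn classifier, not only KNN. The following is a minimal sketch and not part of the original answer: the helper name single_feature_scores, the explicit cv=5, and the use of DecisionTreeClassifier(random_state=0) as a second model are assumptions made for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def single_feature_scores(clf, X, y, cv=5):
    """Mean cross-validated accuracy of clf trained on each feature alone.

    Hypothetical helper; cv=5 is an assumption, fixed so that results
    do not depend on the library's default number of folds.
    """
    return np.array([
        # X[:, [i]] keeps column i as a 2-D array of shape (n_samples, 1)
        cross_val_score(clf, X[:, [i]], y, cv=cv).mean()
        for i in range(X.shape[1])
    ])


iris = datasets.load_iris()

for clf in (KNeighborsClassifier(), DecisionTreeClassifier(random_state=0)):
    scores = single_feature_scores(clf, iris.data, iris.target)
    ranking = np.argsort(scores)[::-1]  # feature indices, best first
    print(type(clf).__name__, ranking, np.round(scores[ranking], 3))
```

The ranking is only a rough guide: a feature that is weak on its own can still be useful in combination with others.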

Alternatively, you could replace X = iris.data[:, i].reshape(-1, 1) in the code above with:

    X_head = np.atleast_2d(iris.data[:, 0:i])
    X_tail = np.atleast_2d(iris.data[:, i+1:])
    X = np.hstack((X_head, X_tail))

In this case you are performing n_feats classifications as well. The difference is that the feature vector used in the i-th classification is made up of all the features except the i-th.

Sample run:

Feature  Accuracy
0        0.973856
1        0.96732
2        0.946895
3        0.959967

These results show that the classifier yields its lowest accuracy when you remove feature 2 (the third column, counting from zero), which is consistent with the first approach, where features 2 and 3 scored highest on their own.
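
For reference, the leave-one-feature-out variant can be written as a complete script. This is a minimal sketch under stated assumptions, not the original author's exact code: it uses np.delete to drop column i instead of stacking X_head and X_tail, fixes cv=5 explicitly, and adds a baseline run on all features so that each removal can be read as a drop in accuracy. Its numbers will not necessarily match the sample run above.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X_all, y = iris.data, iris.target
clf = KNeighborsClassifier()
cv = 5  # assumption: set explicitly rather than relying on the default

# Baseline (an addition for comparison): accuracy with every feature present
baseline = cross_val_score(clf, X_all, y, cv=cv).mean()
print('All features: %.4f' % baseline)

drops = []
print('Removed  Accuracy  Drop')
for i in range(X_all.shape[1]):
    # Drop column i and keep the remaining features
    X = np.delete(X_all, i, axis=1)
    acc = cross_val_score(clf, X, y, cv=cv).mean()
    drops.append(baseline - acc)
    print('%d        %.4f    %+.4f' % (i, acc, baseline - acc))

# The feature whose removal costs the most accuracy
print('Largest drop when removing feature', int(np.argmax(drops)))
```

A larger drop suggests the classifier relies more on that feature. When features are strongly correlated the drops can be small for all of them, since the remaining features carry much of the same information.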



Source: https://stackoverflow.com/questions/42088336/how-to-get-the-most-contributing-feature-in-any-classifier-sklearn-for-example-d
