How to get ROC curve for decision tree?

Anonymous (unverified), submitted 2019-12-03 07:50:05

Question:

I am trying to compute the ROC curve and the AUROC for a decision tree. My code was something like this:

    clf.fit(x,y)
    y_score = clf.fit(x,y).decision_function(test[col])
    pred = clf.predict_proba(test[col])
    print(sklearn.metrics.roc_auc_score(actual,y_score))
    fpr,tpr,thre = sklearn.metrics.roc_curve(actual,y_score)

Output:

 Error() 'DecisionTreeClassifier' object has no attribute 'decision_function' 

Basically, the error comes up when computing y_score. Please explain what y_score is and how to solve this problem.

Answer 1:

First of all, the DecisionTreeClassifier has no attribute decision_function.

Judging from the structure of your code, you probably saw this example.

In that example the classifier is not a decision tree but a OneVsRestClassifier, which supports the decision_function method.

You can see the available attributes of DecisionTreeClassifier here
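
A quick way to confirm which scoring methods an estimator exposes is to probe it with hasattr. The sketch below is only an illustration: LinearSVC is not part of the question and was picked as an example of a classifier that does provide decision_function.

```python
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Record which scoring methods each estimator class exposes.
support = {}
for estimator in (DecisionTreeClassifier(), LinearSVC()):
    name = type(estimator).__name__
    support[name] = {
        "decision_function": hasattr(estimator, "decision_function"),
        "predict_proba": hasattr(estimator, "predict_proba"),
    }

for name, methods in support.items():
    print(name, methods)
```

The tree offers predict_proba but not decision_function, which is why the call in the question fails.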

A possible way to do it is to binarize the classes and then compute the AUC for each class:

Example:

    from sklearn import datasets
    from sklearn.metrics import roc_curve, auc
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import label_binarize
    from sklearn.tree import DecisionTreeClassifier

    iris = datasets.load_iris()
    X = iris.data
    y = iris.target

    # One indicator column per class (multilabel format)
    y = label_binarize(y, classes=[0, 1, 2])
    n_classes = y.shape[1]

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)

    classifier = DecisionTreeClassifier()

    # predict returns hard 0/1 labels, so each curve has a single operating point
    y_score = classifier.fit(X_train, y_train).predict(X_test)

    # Compute the ROC curve and ROC area for each class
    fpr = dict()
    tpr = dict()
    roc_auc = dict()
    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])

    # Compute micro-average ROC curve and ROC area
    fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_score.ravel())
    roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])

    # ROC area for a specific class, here class 2
    roc_auc[2]

Result:

0.94852941176470573 


Answer 2:

I think that for a decision tree you can use .predict_proba() instead of .decision_function(), so you will get something like this:

    y_score = classifier.fit(X_train, y_train).predict_proba(X_test)

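Then the rest of the code stays largely the same. The roc_curve function from scikit-learn accepts several kinds of scores as input: "Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by “decision_function” on some classifiers)." See here for more details. Note that predict_proba returns one column per class, so for a binary problem you pass the positive-class column, y_score[:, 1]. If the tree is fit on the binarized multilabel y from Answer 1, predict_proba instead returns a list with one array per output, so it is simpler to fit on the original integer labels and binarize only the test labels.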
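
As a minimal sketch of the predict_proba approach on the iris data from Answer 1: the tree is fit on the integer labels, and only the test labels are binarized for the per-class curves. The max_depth, random_state, and stratify settings are illustrative assumptions, not part of either answer.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)

# Illustrative hyperparameters; a shallow tree gives coarse probabilities.
classifier = DecisionTreeClassifier(max_depth=3, random_state=0)

# One probability column per class, ordered as in classifier.classes_
y_score = classifier.fit(X_train, y_train).predict_proba(X_test)

# Binarize only the test labels so column i marks membership in class i
classes = classifier.classes_
y_test_bin = label_binarize(y_test, classes=classes)

# One-vs-rest ROC curve and area for each class
fpr, tpr, roc_auc = {}, {}, {}
for i in range(len(classes)):
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

# Micro-average: pool every (sample, class) pair into one binary problem
fpr["micro"], tpr["micro"], _ = roc_curve(y_test_bin.ravel(), y_score.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])

print(roc_auc)
```

Because a tree's probabilities are the class fractions in each leaf, the curve has at most as many distinct thresholds as there are distinct leaf values, so it looks like a few straight segments rather than a smooth curve.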


