ROC function error “Predictor must be numeric or ordered.”

Submitted by 你说的曾经没有我的故事 on 2021-02-08 03:41:51

Question


I am not able to get the ROC function to work; I get the error "Predictor must be numeric or ordered".

I've looked through other posts, but nothing solves my problem. Any help is highly appreciated.

"Get data"
flying=dget("https://www.math.ntnu.no/emner/TMA4268/2019v/data/flying.dd")
ctrain=flying$ctrain
ctest=flying$ctest


library(MASS)
fly_qda=qda(diabetes~., data=ctrain)


#Test error is given below:
predict_qda=predict(fly_qda, newdata=ctest, probability=TRUE)
table_qda<-table(ctest$diabetes, predict_qda$class)
error_qda<-1-sum(diag(table_qda))/sum(table_qda)
error_qda

"ROC curve and AUC"
predict_qdatrain<-predict(fly_qda, newdata=ctrain)
roc_qda=roc(response=ctrain$diabetes, predictor= predict_qdatrain$class, plot=TRUE)
plot(roc_qda, col="red", lwd=3, main="ROC curve QDA")
auc_qda<-auc(roc_qda)

I want the plotted ROC curve and the AUC.


Answer 1:


As Ollie Perkins explained in his answer, the error indicates that you are passing something that cannot be sorted and therefore cannot be used for ROC analysis. In the output of predict.qda, the class item is an unordered factor whose levels (0 and 1) are the predicted classes.

Rather than converting the class to an ordered predictor, it is better to use the posterior probabilities. Let's use the probability of belonging to class 1:

roc_qda <- roc(response = ctrain$diabetes, predictor = predict_qdatrain$posterior[,"1"])
plot(roc_qda, col="red", lwd=3, main="ROC curve QDA")
auc(roc_qda)

This will give you a smoother curve and more classification thresholds to choose from.
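
To make the difference concrete, here is a minimal sketch that compares the two kinds of predictor. It does not use the flying data: as an assumption for illustration it fits QDA to the Pima.tr / Pima.te data sets shipped with MASS, whose response type has levels "No" and "Yes". The names fit, pred, roc_class and roc_post are hypothetical.

```r
library(MASS)   # qda(), plus the Pima.tr / Pima.te example data
library(pROC)   # roc(), auc()

# Stand-in data (assumption): the response `type` is a factor with levels "No"/"Yes"
fit  <- qda(type ~ ., data = Pima.tr)
pred <- predict(fit, newdata = Pima.te)

# Hard class labels recoded as 0/1: only two distinct predictor values
roc_class <- roc(response = Pima.te$type,
                 predictor = as.numeric(pred$class == "Yes"),
                 levels = c("No", "Yes"), direction = "<")

# Posterior probability of the "Yes" class: many distinct predictor values
roc_post <- roc(response = Pima.te$type,
                predictor = pred$posterior[, "Yes"],
                levels = c("No", "Yes"), direction = "<")

# Number of candidate thresholds and the AUC for each predictor
n_thr_class <- length(roc_class$thresholds)
n_thr_post  <- length(roc_post$thresholds)
print(c(class = n_thr_class, posterior = n_thr_post))
print(c(class = as.numeric(auc(roc_class)),
        posterior = as.numeric(auc(roc_post))))
```

With hard labels the curve reduces to a single interior point joined to the corners, whereas the posterior probabilities trace out one point per distinct predicted value.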




Answer 2:


Assuming you are using the pROC package, I have fixed this below. The error message means that the predictor must be either numeric (integer or floating point) or an ordered factor (a categorical variable whose levels have a meaningful order). To compute the ROC curve from your predict object, I have therefore converted the predicted class to an ordered factor on the fly.
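
The type problem can be checked directly in base R. The sketch below uses a small made-up factor, cls, as a hypothetical stand-in for predict_qdatrain$class, and assumes that the check behind the error message amounts to is.numeric() or is.ordered(), as its wording suggests.

```r
# Hypothetical stand-in for predict_qdatrain$class: an unordered factor of labels
cls <- factor(c("0", "1", "1", "0"))

plain_numeric <- is.numeric(cls)   # FALSE: a factor is not numeric
plain_ordered <- is.ordered(cls)   # FALSE: levels are unordered by default

# Option 1: declare the levels ordered ("0" < "1")
cls_ordered <- factor(cls, ordered = TRUE)

# Option 2: convert the labels to numbers; go through as.character()
# so that the labels, not the internal level codes, are used
cls_numeric <- as.numeric(as.character(cls))

print(is.ordered(cls_ordered))  # TRUE
print(cls_numeric)              # 0 1 1 0
```

Either converted object passes the check; which one is appropriate depends on whether the labels are themselves meaningful numbers.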

Secondly, your original code predicted on the training set. I have changed this to the test data below.

"Get data"

flying=dget("https://www.math.ntnu.no/emner/TMA4268/2019v/data/flying.dd")
ctrain=flying$ctrain
ctest=flying$ctest


library(MASS)
library(pROC)
fly_qda=qda(diabetes~., data=ctrain)


#Test error is given below:
predict_qda=predict(fly_qda, newdata=ctest)
table_qda<-table(ctest$diabetes, predict_qda$class)
error_qda<-1-sum(diag(table_qda))/sum(table_qda)
error_qda

"ROC curve and AUC"
# Use the test-set predictions, with the class converted to an ordered factor
roc_qda=roc(response=ctest$diabetes, predictor=factor(predict_qda$class, ordered=TRUE))
plot(roc_qda, col="red", lwd=3, main="ROC curve QDA")
auc_qda<-auc(roc_qda)
auc_qda


Source: https://stackoverflow.com/questions/55760669/roc-function-error-predictor-must-be-numeric-or-ordered