Given a vector of scores and a vector of actual class labels, how do you calculate a single-number AUC metric for a binary classifier in the R language or in simple English?
Combining code from ISL 9.6.3 (ROC Curves) with @J. Won.'s answer to this question and a few other sources, the following plots the ROC curve and prints the AUC in the bottom right of the plot.
Below, probs is a numeric vector of predicted probabilities for binary classification and test$label contains the true labels of the test data.
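library(ROCR)  # prediction(), performance()
library(pROC)  # auc()

rocplot <- function(pred, truth, ...) {
  # ROC curve (true positive rate vs. false positive rate) via ROCR
  predob <- prediction(pred, truth)
  perf <- performance(predob, "tpr", "fpr")
  plot(perf, ...)
  # AUC via pROC; its argument order is (response, predictor)
  area <- as.numeric(pROC::auc(truth, pred))
  area <- format(round(area, 4), nsmall = 4)
  text(x = 0.8, y = 0.1, labels = paste("AUC =", area))
  # the reference x = y line
  segments(x0 = 0, y0 = 0, x1 = 1, y1 = 1, col = "gray", lty = 2)
}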
rocplot(probs, test$label, col="blue")
This gives a plot like this:
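If only the single number is needed, no plot and no packages are required. In plain English, the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it from ranks (the Mann-Whitney U identity) in base R. The function name auc_rank and the toy vectors are made up for illustration, and the labels are assumed to be coded 0/1 with 1 as the positive class.

```r
# AUC via the rank-sum (Mann-Whitney U) identity, base R only.
# Assumption: `labels` is coded 0/1, with 1 marking the positive class.
auc_rank <- function(scores, labels) {
  stopifnot(length(scores) == length(labels), all(labels %in% c(0, 1)))
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(scores)  # average ranks, so tied scores count as 1/2
  # U statistic of the positives, divided by the number of (pos, neg) pairs
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data (hypothetical): 8 of the 9 positive/negative pairs are ordered correctly
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4)
labels <- c(1, 1, 0, 1, 0, 0)
auc_rank(scores, labels)  # 8/9, about 0.8889
```

This should agree with pROC's auc() whenever higher scores indicate the positive class. By default pROC picks the comparison direction automatically, so the two may differ for a classifier that scores worse than chance.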