Additional metrics in caret - PPV, sensitivity, specificity

無奈伤痛  2020-12-10 07:39

I used caret for logistic regression in R:

  ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
                       savePredictions = TRUE)


        
1 Answer

愿得一人  2020-12-10 08:39

    caret already has summary functions that output all the metrics you mention; each of them can also be called directly on a data frame of predictions, as sketched after the list below:

    defaultSummary outputs Accuracy and Kappa
    twoClassSummary outputs AUC (the area under the ROC curve - see the last lines of this answer), sensitivity and specificity
    prSummary outputs the area under the precision-recall curve, precision (which is the same quantity as PPV), recall and F
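    A minimal sketch of calling them directly: the layout below (obs, pred, plus one probability column per class level) is what caret passes to these functions internally. twoClassSummary needs the pROC package and prSummary needs MLmetrics; the class names "M" and "R" and the frame d are just placeholders:

    library(caret)

    set.seed(1)
    d <- data.frame(obs = factor(sample(c("M", "R"), 50, replace = TRUE),
                                 levels = c("M", "R")),
                    M   = runif(50))           # predicted probability of class "M"
    d$R    <- 1 - d$M                          # predicted probability of class "R"
    d$pred <- factor(ifelse(d$M > 0.5, "M", "R"), levels = c("M", "R"))

    defaultSummary(d, lev = c("M", "R"))   # Accuracy, Kappa
    twoClassSummary(d, lev = c("M", "R"))  # ROC, Sens, Spec
    prSummary(d, lev = c("M", "R"))        # AUC (PR curve), Precision, Recall, F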

    To get all of these metrics combined, you can write your own summary function that merges the outputs of the three:

    library(caret)

    MySummary <- function(data, lev = NULL, model = NULL) {
      a1 <- defaultSummary(data, lev, model)   # Accuracy, Kappa
      b1 <- twoClassSummary(data, lev, model)  # ROC, Sens, Spec
      c1 <- prSummary(data, lev, model)        # AUC (PR), Precision, Recall, F
      c(a1, b1, c1)
    }
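
    As a quick sanity check, the combined function can be called on the toy frame d from the sketch above; it should return all nine metrics as one named vector:

    MySummary(d, lev = c("M", "R"))
    # Accuracy, Kappa, ROC, Sens, Spec, AUC, Precision, Recall, F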
    

    Let's try it on the Sonar data set:

    library(mlbench)
    data("Sonar")
    

    When defining the train control it is important to set classProbs = TRUE, since some of these metrics (ROC and prAUC) cannot be calculated from the predicted class alone; they require the predicted class probabilities.

    ctrl <- trainControl(method = "repeatedcv",
                         number = 10,
                         savePredictions = TRUE,
                         summaryFunction = MySummary,
                         classProbs = TRUE)
    

    Now fit the model of your choice:

    mod_fit <- train(Class ~.,
                     data = Sonar,
                     method = "rf",
                     trControl = ctrl)
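
    # Since MySummary returns several metrics, train() can optimise any of
    # their names when tuning mtry instead of the default Accuracy; for
    # example (a sketch; mod_fit_roc is a hypothetical name):
    mod_fit_roc <- train(Class ~ .,
                         data = Sonar,
                         method = "rf",
                         metric = "ROC",
                         trControl = ctrl)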
    
    mod_fit$results
    #output
      mtry  Accuracy     Kappa       ROC      Sens      Spec       AUC Precision    Recall         F AccuracySD   KappaSD
    1    2 0.8364069 0.6666364 0.9454798 0.9280303 0.7333333 0.8683726 0.8121087 0.9280303 0.8621526 0.10570484 0.2162077
    2   31 0.8179870 0.6307880 0.9208081 0.8840909 0.7411111 0.8450612 0.8074942 0.8840909 0.8374326 0.06076222 0.1221844
    3   60 0.8034632 0.6017979 0.9049242 0.8659091 0.7311111 0.8332068 0.7966889 0.8659091 0.8229330 0.06795824 0.1369086
           ROCSD     SensSD    SpecSD      AUCSD PrecisionSD   RecallSD        FSD
    1 0.04393947 0.05727927 0.1948585 0.03410854  0.12717667 0.05727927 0.08482963
    2 0.04995650 0.11053858 0.1398657 0.04694993  0.09075782 0.11053858 0.05772388
    3 0.04965178 0.12047598 0.1387580 0.04820979  0.08951728 0.12047598 0.06715206
    

    In this output, the ROC column is in fact the area under the ROC curve (usually called AUC), while the AUC column is the area under the precision-recall curve computed across all cutoffs.
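
    Finally, since PPV is simply precision under another name, it can also be read off caret's confusionMatrix() applied to the saved hold-out predictions. A sketch (it relies on savePredictions = TRUE in the train control; best_preds and cm are hypothetical names, and "M" is used as the positive class of Sonar):

    best_preds <- subset(mod_fit$pred, mtry == mod_fit$bestTune$mtry)
    cm <- confusionMatrix(data = best_preds$pred,
                          reference = best_preds$obs,
                          positive = "M")
    cm$byClass["Pos Pred Value"]   # PPV, i.e. precision for class "M"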
