How to assess the confidence score of a prediction with scikit-learn

Submitted by 青春壹個敷衍的年華 on 2021-02-05 06:15:41

Question


I have written a simple function that takes one argument, `query_Seq`; further methods calculate descriptors, and in the end a prediction is made using the `LogisticRegression` algorithm (or any other algorithm passed to the function): `0` (negative for the given case) or `1` (positive for the given case).

from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

def main_process(query_Seq):
    LR = LogisticRegression()
    GNB = GaussianNB()
    KNB = KNeighborsClassifier()
    DT = DecisionTreeClassifier()
    SV = SVC(probability=True)

    # data_gen, p, DC_CLASS and Prediction are project-specific helpers
    train_x, train_y, train_l = data_gen(p)
    a = DC_CLASS()
    test_x = a.main_p(query_Seq)
    return Prediction(train_x, train_y, test_x, LR)

While performing cross-validation we calculated various statistical parameters for accuracy estimation (specificity, sensitivity, MCC, etc.) for an algorithm. My question: is there a method in scikit-learn with which we can estimate the confidence score for a prediction on test data?


Answer 1:


Many classifiers can give you a hint of their own confidence level for a given prediction if you call the predict_proba method instead of predict. Read the docstring of this method to understand the contents of the numpy array it returns.
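For instance, a minimal sketch of what predict_proba returns (using a synthetic dataset here in place of the asker's descriptor features):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy binary classification data standing in for the descriptor features
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
clf = LogisticRegression().fit(X, y)

# predict_proba returns one row per sample and one column per class,
# in the order given by clf.classes_; each row sums to 1.
proba = clf.predict_proba(X[:3])
print(clf.classes_)       # [0 1]
print(proba.shape)        # (3, 2)
print(proba.sum(axis=1))  # each row sums to ~1.0
```

The column matching the predicted class holds the estimated probability of that class, which is the "confidence score" for that prediction.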

Note, however, that classifiers can also make mistakes when estimating their own confidence level. To address this you can use an external calibration procedure to calibrate the classifier on held-out data (using a cross-validation loop). The documentation gives more details on calibration:

http://scikit-learn.org/stable/modules/calibration.html
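As a sketch of that calibration procedure (again on synthetic data), scikit-learn's CalibratedClassifierCV wraps a base estimator and fits the calibration mapping on internal cross-validation folds:

```python
from sklearn.datasets import make_classification
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Wrap an SVC; sigmoid (Platt) scaling is fitted on held-out CV folds.
# probability=True is not needed here: the calibrator maps the SVC's
# decision_function scores to probabilities itself.
calibrated = CalibratedClassifierCV(SVC(), method="sigmoid", cv=3)
calibrated.fit(X, y)

# The wrapped model now exposes calibrated predict_proba output
proba = calibrated.predict_proba(X[:5])
print(proba.shape)  # (5, 2)
```

method="isotonic" is an alternative when you have enough data; sigmoid scaling is the safer choice for small datasets.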

Finally, note that LogisticRegression gives reasonably well-calibrated confidence levels by default. Most other model classes benefit from external calibration.



Source: https://stackoverflow.com/questions/36643344/how-to-assess-the-confidence-score-of-a-prediction-with-scikit-learn
