I\'m doing a multiclass text classification in Scikit-Learn. The dataset is being trained using the Multinomial Naive Bayes classifier having hundreds of labels. Here\'s an extr
While the previous answers are probably all working I found them a bit verbose. The following stores the individual class results as well as the summary line in a single dataframe. Not very sensitive to changes in the report but did the trick for me.
#init snippet and fake data
from io import StringIO
import re
import pandas as pd
from sklearn import metrics
true_label = [1,1,2,2,3,3]
pred_label = [1,2,2,3,3,1]
def report_to_df(report):
report = re.sub(r" +", " ", report).replace("avg / total", "avg/total").replace("\n ", "\n")
report_df = pd.read_csv(StringIO("Classes" + report), sep=' ', index_col=0)
return(report_df)
#txt report to df
report = metrics.classification_report(true_label, pred_label)
report_df = report_to_df(report)
#store, print, copy...
print (report_df)
Which gives the desired output:
Classes precision recall f1-score support
1 0.5 0.5 0.5 2
2 0.5 0.5 0.5 2
3 0.5 0.5 0.5 2
avg/total 0.5 0.5 0.5 6