I wrote a confusion matrix calculation code in Python:
def conf_mat(prob_arr, input_arr):
# confusion matrix
conf_arr = [[0, 0], [0, 0]]
If you don't want scikit-learn to do the work for you...
import numpy
actual = numpy.array(actual)
predicted = numpy.array(predicted)
# calculate the confusion matrix; labels is numpy array of classification labels
cm = numpy.zeros((len(labels), len(labels)))
for a, p in zip(actual, predicted):
cm[a][p] += 1
# also get the accuracy easily with numpy
accuracy = (actual == predicted).sum() / float(len(actual))
Or take a look at a more complete implementation here in NLTK.
A numpy-only solution for any number of classes that doesn't require looping:
import numpy as np
classes = 3
true = np.random.randint(0, classes, 50)
pred = np.random.randint(0, classes, 50)
np.bincount(true * classes + pred).reshape((classes, classes))
Only with numpy, we can do as follow considering efficiency:
def confusion_matrix(pred, label, nc=None):
assert pred.size == label.size
if nc is None:
nc = len(unique(label))
logging.debug("Number of classes assumed to be {}".format(nc))
confusion = np.zeros([nc, nc])
# avoid the confusion with `0`
tran_pred = pred + 1
for i in xrange(nc): # current class
mask = (label == i)
masked_pred = mask * tran_pred
cls, counts = unique(masked_pred, return_counts=True)
# discard the first item
cls = [cl - 1 for cl in cls][1:]
counts = counts[1:]
for cl, count in zip(cls, counts):
confusion[i, cl] = count
return confusion
For other features such as plot, mean-IoU, see my repositories.
Nearly a decade has passed, yet the solutions (without sklearn) to this post are convoluted and unnecessarily long. Computing a confusion matrix can be done cleanly in Python in a few lines. For example:
import numpy as np
def compute_confusion_matrix(true, pred):
'''Computes a confusion matrix using numpy for two np.arrays
true and pred.
Results are identical (and similar in computation time) to:
"from sklearn.metrics import confusion_matrix"
However, this function avoids the dependency on sklearn.'''
K = len(np.unique(true)) # Number of classes
result = np.zeros((K, K))
for i in range(len(true)):
result[true[i]][pred[i]] += 1
return result
Scikit-learn (which I recommend using anyways) has it included in the metrics
module:
>>> from sklearn.metrics import confusion_matrix
>>> y_true = [0, 1, 2, 0, 1, 2, 0, 1, 2]
>>> y_pred = [0, 0, 0, 0, 1, 1, 0, 2, 2]
>>> confusion_matrix(y_true, y_pred)
array([[3, 0, 0],
[1, 1, 1],
[1, 1, 1]])
This function creates confusion matrices for any number of classes.
def create_conf_matrix(expected, predicted, n_classes):
m = [[0] * n_classes for i in range(n_classes)]
for pred, exp in zip(predicted, expected):
m[pred][exp] += 1
return m
def calc_accuracy(conf_matrix):
t = sum(sum(l) for l in conf_matrix)
return sum(conf_matrix[i][i] for i in range(len(conf_matrix))) / t
In contrast to your function above, you have to extract the predicted classes before calling the function, based on your classification results, i.e. sth. like
[1 if p < .5 else 2 for p in classifications]