How to compute Receiver Operating Characteristic (ROC) and AUC in Keras?

Question

I have a multi-output (200) binary classification model that I wrote in Keras.

I want to add metrics such as ROC and AUC to this model, but to my knowledge Keras doesn't have built-in ROC and AUC metric functions.

I tried to import the ROC and AUC functions from scikit-learn:

from sklearn.metrics import roc_curve, auc
from keras.models import Sequential
from keras.layers import Dense
.
.
.
model.add(Dense(200, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(400, activation='relu'))
model.add(Dense(300, activation='relu'))
model.add(Dense(200, init='normal', activation='softmax'))  # output layer

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy', 'roc_curve', 'auc'])

but it's giving this error:

Exception: Invalid metric: roc_curve

How should I add ROC, AUC to keras?

Answer 1:

Because ROC and AUC cannot be computed from mini-batches, you can only calculate them at the end of an epoch. Here is a solution from jamartinh; I have copied the code below for convenience:

from sklearn.metrics import roc_auc_score
from keras.callbacks import Callback
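class roc_callback(Callback):
    def __init__(self, training_data, validation_data):
        super().__init__()  # let the base Callback set up its own state
        # keep both splits so AUC can be computed on each at epoch end
        self.x = training_data[0]
        self.y = training_data[1]
        self.x_val = validation_data[0]
        self.y_val = validation_data[1]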


    def on_train_begin(self, logs={}):
        return

    def on_train_end(self, logs={}):
        return

    def on_epoch_begin(self, epoch, logs={}):
        return

    def on_epoch_end(self, epoch, logs={}):
        y_pred = self.model.predict(self.x)
        roc = roc_auc_score(self.y, y_pred)
        y_pred_val = self.model.predict(self.x_val)
        roc_val = roc_auc_score(self.y_val, y_pred_val)
        print('\rroc-auc: %s - roc-auc_val: %s' % (str(round(roc,4)),str(round(roc_val,4))),end=100*' '+'\n')
        return

    def on_batch_begin(self, batch, logs={}):
        return

    def on_batch_end(self, batch, logs={}):
        return

model.fit(X_train, y_train, validation_data=(X_test, y_test), callbacks=[roc_callback(training_data=(X_train, y_train),validation_data=(X_test, y_test))])
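
To illustrate why the callback above evaluates on the whole dataset rather than per batch, here is a minimal pure-Python sketch. The roc_auc helper and the toy labels and scores are hypothetical, made up for illustration. The helper computes AUC as the fraction of positive/negative pairs that are ranked correctly, which is not how scikit-learn implements roc_auc_score but is the same quantity.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two mini-batches of four samples each.
batch_labels = [[0, 0, 1, 1], [0, 0, 1, 1]]
batch_scores = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]

# Averaging per-batch AUCs: each batch is perfectly ranked on its own.
per_batch = [roc_auc(y, s) for y, s in zip(batch_labels, batch_scores)]
mean_batch_auc = sum(per_batch) / len(per_batch)

# AUC over the full epoch: positives from batch 1 rank below negatives from batch 2.
all_labels = [y for batch in batch_labels for y in batch]
all_scores = [s for batch in batch_scores for s in batch]
epoch_auc = roc_auc(all_labels, all_scores)

print(mean_batch_auc, epoch_auc)
```

On this toy data the per-batch average is 1.0 while the AUC over all eight samples is 0.75, so the two numbers can disagree substantially.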

A more hackable way, using tf.contrib.metrics.streaming_auc (TensorFlow 1.x only, since tf.contrib was removed in TensorFlow 2.0):

import numpy as np
import tensorflow as tf
from sklearn.metrics import roc_auc_score
from sklearn.datasets import make_classification
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.callbacks import Callback, EarlyStopping


# define the auc_roc metric, inspired by https://github.com/keras-team/keras/issues/6050#issuecomment-329996505
def auc_roc(y_true, y_pred):
    # any tensorflow metric
    value, update_op = tf.contrib.metrics.streaming_auc(y_pred, y_true)

    # find all variables created for this metric
    metric_vars = [i for i in tf.local_variables() if 'auc_roc' in i.name.split('/')[1]]

    # Add metric variables to GLOBAL_VARIABLES collection.
    # They will be initialized for new session.
    for v in metric_vars:
        tf.add_to_collection(tf.GraphKeys.GLOBAL_VARIABLES, v)

    # force to update metric values
    with tf.control_dependencies([update_op]):
        value = tf.identity(value)
        return value

# generate a small dataset
N_all = 10000
N_tr = int(0.7 * N_all)
N_te = N_all - N_tr
X, y = make_classification(n_samples=N_all, n_features=20, n_classes=2)
y = np_utils.to_categorical(y, num_classes=2)

X_train, X_valid = X[:N_tr, :], X[N_tr:, :]
y_train, y_valid = y[:N_tr, :], y[N_tr:, :]

# model & train
model = Sequential()
model.add(Dense(2, activation="softmax", input_shape=(X.shape[1],)))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy', auc_roc])

my_callbacks = [EarlyStopping(monitor='auc_roc', patience=300, verbose=1, mode='max')]

model.fit(X, y,
          validation_split=0.3,
          shuffle=True,
          batch_size=32, epochs=5, verbose=1,  # Keras 2 renamed nb_epoch to epochs
          callbacks=my_callbacks)

# # or use an independent validation set
# model.fit(X_train, y_train,
#           validation_data=(X_valid, y_valid),
#           batch_size=32, epochs=5, verbose=1,
#           callbacks=my_callbacks)
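
For intuition about what a streaming AUC metric does, here is a rough pure-Python sketch of the general idea: keep running true-positive and false-positive counts at a fixed grid of thresholds, update them batch by batch, and integrate the resulting ROC curve when a value is requested. StreamingAUC and its methods are hypothetical names. This is not TensorFlow's implementation, and details such as the threshold grid are assumptions.

```python
class StreamingAUC:
    """Approximate AUC from confusion counts accumulated over batches (hypothetical helper)."""

    def __init__(self, num_thresholds=200):
        # Evenly spaced thresholds in [0, 1]; scores are assumed to be probabilities.
        self.thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
        self.tp = [0] * num_thresholds  # true positives per threshold
        self.fp = [0] * num_thresholds  # false positives per threshold
        self.n_pos = 0
        self.n_neg = 0

    def update(self, labels, scores):
        """Fold one batch into the running counts."""
        for y, s in zip(labels, scores):
            if y == 1:
                self.n_pos += 1
            else:
                self.n_neg += 1
            for i, t in enumerate(self.thresholds):
                if s >= t:  # sample is predicted positive at threshold t
                    if y == 1:
                        self.tp[i] += 1
                    else:
                        self.fp[i] += 1

    def result(self):
        """Trapezoidal area under the (FPR, TPR) points seen so far."""
        if self.n_pos == 0 or self.n_neg == 0:
            raise ValueError("need at least one positive and one negative sample")
        # Walk thresholds from high to low so FPR and TPR are non-decreasing.
        points = [(0.0, 0.0)]
        for tp, fp in zip(reversed(self.tp), reversed(self.fp)):
            points.append((fp / self.n_neg, tp / self.n_pos))
        area = 0.0
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            area += (x1 - x0) * (y0 + y1) / 2.0
        return area


# Hypothetical toy batches; counts carry over from one update to the next.
metric = StreamingAUC()
metric.update([0, 0, 1, 1], [0.1, 0.2, 0.3, 0.4])
metric.update([0, 0, 1, 1], [0.5, 0.6, 0.7, 0.8])
streamed_auc = metric.result()
print(streamed_auc)
```

Because the counts persist across updates, the result covers every sample seen so far rather than only the latest batch.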

Answer 2:

Like you, I prefer using scikit-learn's built-in methods to evaluate AUROC. I find that the best and easiest way to do this in Keras is to create a custom metric. If TensorFlow is your backend, this takes very few lines of code:

import tensorflow as tf
from sklearn.metrics import roc_auc_score

def auroc(y_true, y_pred):
    return tf.py_func(roc_auc_score, (y_true, y_pred), tf.double)

# Build Model...

model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['accuracy', auroc])

Creating a custom Callback as mentioned in other answers will not work for your case since your model has multiple outputs, but this will. Additionally, this method allows the metric to be evaluated on both training and validation data, whereas a Keras callback does not have access to the training data and can thus only be used to evaluate performance on the validation data.
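
For a multi-output model like the one in the question, one common way to get a single number is to compute AUC separately for each output column and average the results (a macro average). A minimal pure-Python sketch follows. The names roc_auc and macro_auc and the toy arrays are hypothetical, and skipping outputs where only one class is present is an assumption made here, not something this answer prescribes.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(y_true, y_score):
    """Mean of per-output AUCs; inputs are lists of rows with one column per output."""
    n_outputs = len(y_true[0])
    aucs = []
    for j in range(n_outputs):
        col_true = [row[j] for row in y_true]
        col_score = [row[j] for row in y_score]
        if len(set(col_true)) < 2:
            continue  # AUC is undefined for this output, so skip it (assumption)
        aucs.append(roc_auc(col_true, col_score))
    if not aucs:
        raise ValueError("no output column contains both classes")
    return sum(aucs) / len(aucs)


# Hypothetical toy data: 4 samples, 3 binary outputs.
y_true = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 1]]
y_score = [[0.9, 0.2, 0.7], [0.1, 0.6, 0.8], [0.8, 0.7, 0.6], [0.3, 0.4, 0.9]]

# Output 0 scores 1.0, output 1 scores 0.75, output 2 is skipped (all labels are 1).
print(macro_auc(y_true, y_score))
```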

Answer 3:

The following solution worked for me:

import tensorflow as tf
from keras import backend as K

def auc(y_true, y_pred):
    auc = tf.metrics.auc(y_true, y_pred)[1]
    K.get_session().run(tf.local_variables_initializer())
    return auc

model.compile(loss="binary_crossentropy", optimizer='adam', metrics=[auc])

Answer 4:

I solved my problem this way.

Suppose you have a test dataset with x_test for the features and y_test for the corresponding targets.

First, predict the targets from the features using the trained model:

y_pred = model.predict_proba(x_test)

Then import the roc_auc_score function from sklearn and simply pass it the original targets and the predicted targets:

from sklearn.metrics import roc_auc_score
roc_auc_score(y_test, y_pred)

Answer 5:

'roc_curve' and 'auc' are not standard metrics, so you can't pass them by name in the metrics argument. You can pass something like 'fmeasure', which is a standard metric.

The available metrics are listed at https://keras.io/metrics/ and you may also want to look at writing your own custom metric: https://keras.io/metrics/#custom-metrics

Also have a look at the generate_results method mentioned in this blog post for ROC and AUC: https://vkolachalama.blogspot.in/2016/05/keras-implementation-of-mlp-neural.html

Answer 6:

Adding to the above answers: I got the error "ValueError: bad input shape ...", so I selected the vector of positive-class probabilities as follows:

y_pred = model.predict_proba(x_test)[:,1]
auc = roc_auc_score(y_test, y_pred)
print(auc)
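
The indexing above can be sketched without Keras. Assuming the model returns one row per sample with two columns, [P(class 0), P(class 1)], the fix is to keep only the second column so that there is exactly one score per label. The probs and y_test values below are hypothetical, and the pairwise AUC computation is a stand-in for roc_auc_score.

```python
# Hypothetical two-column class-probability output: [P(class 0), P(class 1)] per sample.
probs = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]]
y_test = [0, 1, 0, 1]

# Keep only P(class 1); on a NumPy array this is probs[:, 1].
y_pred = [row[1] for row in probs]

# Shapes now agree: one score per label.
assert len(y_pred) == len(y_test)

# Stand-in for roc_auc_score: fraction of (positive, negative) pairs ranked correctly.
pos = [s for s, y in zip(y_pred, y_test) if y == 1]
neg = [s for s, y in zip(y_pred, y_test) if y == 0]
auc = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (len(pos) * len(neg))
print(auc)
```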

Source: https://stackoverflow.com/questions/41032551/how-to-compute-receiving-operating-characteristic-roc-and-auc-in-keras

Tags

python

theano

keras