Keras custom RMSLE metric

Question

How do I implement this metric in Keras? My code below gives the wrong result. Note that I undo an earlier log(x + 1) transformation via exp(x) - 1, and that negative predictions are clipped to 0:

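from keras import backend as K

def rmsle_cust(y_true, y_pred):
    # undo log(x + 1) and clip negative values to 0
    first_log = K.clip(K.exp(y_pred) - 1.0, 0, None)
    second_log = K.clip(K.exp(y_true) - 1.0, 0, None)
    # root of the mean squared log error, reduced over the last axis
    return K.sqrt(K.mean(K.square(K.log(first_log + 1.) - K.log(second_log + 1.)), axis=-1))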

For comparison, here's the standard numpy implementation:

import math
import numpy as np

def rmsle_cust_py(y, y_pred, **kwargs):
    # undo log(x + 1)
    y = np.exp(y) - 1
    y_pred = np.exp(y_pred) - 1

    # clip negative predictions to 0
    y_pred[y_pred < 0] = 0.0
    to_sum = [(math.log(y_pred[i] + 1) - math.log(y[i] + 1)) ** 2.0 for i in range(len(y_pred))]
    return (sum(to_sum) * (1.0/len(y))) ** 0.5

What am I doing wrong? Thanks!

EDIT: Setting axis=0 seems to give a value very close to the correct one, but I'm not sure, since all the code I've seen uses axis=-1.

Answer 1:

I ran into the same problem and searched for a solution. Here is what I found:

https://www.kaggle.com/jpopham91/rmlse-vectorized

After modifying it a bit, this seems to work for me. rmsle_K is the version implemented with Keras and TensorFlow.

import numpy as np
import math
from keras import backend as K
import tensorflow as tf

def rmsle(y, y0):
    assert len(y) == len(y0)
    return np.sqrt(np.mean(np.power(np.log1p(y)-np.log1p(y0), 2)))

def rmsle_loop(y, y0):
    assert len(y) == len(y0)
    terms_to_sum = [(math.log(y0[i] + 1) - math.log(y[i] + 1)) ** 2.0 for i,pred in enumerate(y0)]
    return (sum(terms_to_sum) * (1.0/len(y))) ** 0.5

def rmsle_K(y, y0):
    return K.sqrt(K.mean(K.square(tf.log1p(y) - tf.log1p(y0))))

r = rmsle(y=[5, 20, 12], y0=[8, 16, 12])
r1 = rmsle_loop(y=[5, 20, 12], y0=[8, 16, 12])
r2 = rmsle_K(y=[5., 20., 12.], y0=[8., 16., 12.])

print(r)

print(r1)

sess = tf.Session()

print(sess.run(r2))

Result:

Using TensorFlow backend

0.263978210565

0.263978210565

0.263978
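
The snippet above uses TensorFlow 1.x names. In TensorFlow 2.x, tf.log1p is available as tf.math.log1p, and tf.Session is not needed because tensors are evaluated eagerly. Below is a minimal sketch of the same check, assuming TensorFlow 2.x is installed; rmsle_tf2 is a hypothetical name chosen here, not part of the original answer.

```python
import tensorflow as tf

def rmsle_tf2(y_true, y_pred):
    # Cast so that plain Python lists and integer inputs also work.
    y_true = tf.cast(y_true, tf.float32)
    y_pred = tf.cast(y_pred, tf.float32)
    # Mean over all elements (no axis argument), then the square root.
    squared = tf.square(tf.math.log1p(y_pred) - tf.math.log1p(y_true))
    return tf.sqrt(tf.reduce_mean(squared))

# Same sample values as in the answer above.
value = float(rmsle_tf2([5., 20., 12.], [8., 16., 12.]))
print(value)
```

Like rmsle_K, this applies log1p to raw values. If the targets are already log(x + 1)-transformed, as in the question, they would need to be transformed back first.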

Answer 2:

Since the numpy implementation builds a list (to_sum), I suspect your numpy array has shape (length,).

In Keras, since you get different results with axis=0 and axis=-1, your tensors probably have a shape like (length,1).

Also, when creating the to_sum list, you index y[i] and y_pred[i], which means the numpy implementation takes elements along axis=0.

The numpy implementation also sums everything when calculating the mean via sum(to_sum). So you don't really need to pass any axis to K.mean.

If you make sure your model's output shape is either (length,) or (length,1), you can just use K.mean(value) without passing the axis parameter.
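
The effect of the axis argument can be checked with plain NumPy, whose mean reduces axes in the same way. This is only a sketch under stated assumptions: the values are made up, the arrays are assumed to have shape (length, 1), and squared_log_errors is a hypothetical helper name. It uses np.expm1 and np.log1p, which are equivalent to the exp(x) - 1 and log(x + 1) steps in the question.

```python
import numpy as np

# Made-up targets and predictions, already in log(x + 1) space, shape (3, 1).
y_true = np.log1p(np.array([[5.0], [20.0], [12.0]]))
y_pred = np.log1p(np.array([[8.0], [16.0], [12.0]]))

def squared_log_errors(y_true, y_pred):
    # Undo log(x + 1), clip negatives to 0, then re-apply log(x + 1).
    pred = np.clip(np.expm1(y_pred), 0, None)
    true = np.clip(np.expm1(y_true), 0, None)
    return np.square(np.log1p(pred) - np.log1p(true))

sq = squared_log_errors(y_true, y_pred)

# axis=-1 reduces only the last axis (size 1), leaving one value per sample.
per_sample = np.sqrt(np.mean(sq, axis=-1))   # shape (3,)

# No axis: reduce over every element, like the loop-based numpy version.
overall = np.sqrt(np.mean(sq))               # a single scalar

# Averaging the per-sample roots does not give the root of the overall mean.
mean_of_roots = per_sample.mean()

print(per_sample.shape, overall, mean_of_roots)
```

With axis=-1 each sample's value collapses to its absolute log error, so any later averaging of those values yields a mean absolute log error rather than the RMSLE. Dropping the axis argument reduces over all elements at once.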

Source: https://stackoverflow.com/questions/47582982/keras-custom-rmsle-metric

Tags

python

deep-learning

keras

metrics