Should a custom loss function in Keras return a single loss value for the batch or an array of losses, one for every sample in the training batch?

南笙 2020-12-20 17:30

I'm learning the Keras API in TensorFlow (2.3). In this guide on the TensorFlow website, I found an example of a custom loss function:

    def custom_mean_squared_error         


        
6 Answers
  •  鱼传尺愫
    2020-12-20 18:09

    As far as I know, the shape of the loss function's return value is not important: it can be a scalar tensor, or a tensor holding one or more values per sample. What matters is how that value is reduced to a scalar so that it can be used in the optimization process or shown to the user. For that, see the reduction types in the Reduction documentation.
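    To make the reduction types concrete, here is a minimal sketch that applies the built-in tf.keras.losses.MeanSquaredError to the same tensors under three reduction settings. The sample values are made up for illustration, and the string values "none", "sum" and "sum_over_batch_size" are assumed to be accepted by your TensorFlow version in place of the Reduction enum members.

```python
import numpy as np
import tensorflow as tf

# Made-up batch of 2 samples with 2 values each (illustration only).
y_true = tf.constant([[0.0, 1.0], [2.0, 3.0]])
y_pred = tf.constant([[1.0, 1.0], [2.0, 5.0]])

# "none": no batch reduction, one loss value per sample.
per_sample = tf.keras.losses.MeanSquaredError(reduction="none")(y_true, y_pred)

# "sum": the per-sample values are added together.
summed = tf.keras.losses.MeanSquaredError(reduction="sum")(y_true, y_pred)

# "sum_over_batch_size": the sum divided by the number of per-sample values.
averaged = tf.keras.losses.MeanSquaredError(
    reduction="sum_over_batch_size")(y_true, y_pred)

print(np.asarray(per_sample))  # per-sample losses: 0.5 and 2.0
print(float(summed))           # 2.5
print(float(averaged))         # 1.25
```

    In every case the optimizer ends up with a scalar; the reduction setting only decides how the per-sample values are combined.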

    Further, here is what the compile method documentation says about the loss argument, partially addressing this point:

    loss: String (name of objective function), objective function or tf.keras.losses.Loss instance. See tf.keras.losses. An objective function is any callable with the signature loss = fn(y_true,y_pred), where y_true = ground truth values with shape = [batch_size, d0, .. dN], except sparse loss functions such as sparse categorical crossentropy where shape = [batch_size, d0, .. dN-1]. y_pred = predicted values with shape = [batch_size, d0, .. dN]. It returns a weighted loss float tensor. If a custom Loss instance is used and reduction is set to NONE, return value has the shape [batch_size, d0, .. dN-1] ie. per-sample or per-timestep loss values; otherwise, it is a scalar. If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses.

    In addition, it's worth noting that most of the built-in loss functions in TF/Keras reduce over the last dimension (i.e. axis=-1).
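    A quick way to see the axis=-1 reduction is to check output shapes. The sketch below uses the function form tf.keras.losses.mean_squared_error with arbitrary tensor shapes chosen for illustration, and compares it against a hand-written mean over the last axis.

```python
import numpy as np
import tensorflow as tf

# Arbitrary shapes for illustration: [batch, features] and [batch, steps, features].
y_true_2d, y_pred_2d = tf.zeros((4, 3)), tf.ones((4, 3))
y_true_3d, y_pred_3d = tf.zeros((2, 5, 3)), tf.ones((2, 5, 3))

# The built-in function averages over the last axis only.
loss_2d = tf.keras.losses.mean_squared_error(y_true_2d, y_pred_2d)
loss_3d = tf.keras.losses.mean_squared_error(y_true_3d, y_pred_3d)

# Hand-written equivalent of the same reduction.
manual = tf.reduce_mean(tf.square(y_true_2d - y_pred_2d), axis=-1)

print(loss_2d.shape)  # (4,)
print(loss_3d.shape)  # (2, 5)
print(np.allclose(np.asarray(loss_2d), np.asarray(manual)))  # True
```

    The last dimension disappears and every other dimension is kept, which matches the [batch_size, d0, .. dN-1] shape described in the compile documentation quoted above.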


    For those who doubt that a custom loss function that returns a scalar value would work: run the following snippet and you will see that the model trains and converges properly.

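    import tensorflow as tf
    import numpy as np

    # Custom loss that returns a single scalar for the whole batch.
    def custom_loss(y_true, y_pred):
        return tf.reduce_sum(tf.square(y_true - y_pred))

    # A single Dense layer is enough to fit the linear target below.
    inp = tf.keras.layers.Input(shape=(3,))
    out = tf.keras.layers.Dense(3)(inp)

    model = tf.keras.Model(inp, out)
    # `learning_rate` replaces the deprecated `lr` argument.
    model.compile(loss=custom_loss, optimizer=tf.keras.optimizers.Adam(learning_rate=0.1))

    # Synthetic data: y is a linear function of x.
    x = np.random.rand(1000, 3)
    y = x * 10 + 2.5
    model.fit(x, y, epochs=20)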
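    For comparison, here is a small sketch that puts the scalar form next to a per-sample form of the same loss. scalar_loss mirrors the custom_loss used in the snippet, per_sample_loss is a hypothetical name introduced here, and the tensors are made-up values.

```python
import tensorflow as tf

def scalar_loss(y_true, y_pred):
    # Same computation as custom_loss: one value for the whole batch.
    return tf.reduce_sum(tf.square(y_true - y_pred))

def per_sample_loss(y_true, y_pred):
    # Hypothetical variant: sums over the last axis only, one value per sample.
    return tf.reduce_sum(tf.square(y_true - y_pred), axis=-1)

# Made-up batch of 2 samples with 3 values each.
y_true = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
y_pred = tf.constant([[1.0, 2.0, 5.0], [3.0, 5.0, 6.0]])

batch_value = scalar_loss(y_true, y_pred)
sample_values = per_sample_loss(y_true, y_pred)

print(batch_value.shape, float(batch_value))       # () 5.0
print(sample_values.shape, sample_values.numpy())  # (2,) [4. 1.]
```

    Either function can be passed to model.compile(loss=...). The per-sample form leaves the final reduction to Keras, while the scalar form has already collapsed the batch, so its value grows with the batch size.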
    
