Question
I am using a multiple-output model in Keras:
model1 = Model(inputs=x, outputs=[y2, y3])
model1.compile(optimizer='sgd', loss=custom_loss_function)
My custom_loss_function is:
from keras import backend as K

def custom_loss_function(y_true, y_pred):
    y2_pred = y_pred[0]
    y2_true = y_true[0]
    loss = K.mean(K.square(y2_true - y2_pred), axis=-1)
    return loss
I only want to train the network on output y2.
What is the shape/structure of the y_pred and y_true arguments in the loss function when multiple outputs are used? Can I access them as above? Is it y_pred[0] or y_pred[:,0]?
Answer 1:
I only want to train the network on output y2.
Based on the Keras functional API guide, you can achieve that by setting loss_weights so that the second output contributes nothing to the total loss:
model1 = Model(inputs=x, outputs=[y2, y3])
model1.compile(optimizer='sgd', loss=custom_loss_function,
               loss_weights=[1., 0.0])
What is the shape/structure of the y_pred and y_true argument in loss function when multiple outputs are used? Can I access them as above? Is it y_pred[0] or y_pred[:,0]?
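In Keras multi-output models, the loss function is applied to each output separately. Inside the loss function, y_true and y_pred are therefore the tensors for a single output, so y_pred[0] indexes the first sample of the batch, not the first output. In pseudo-code:
loss = sum([loss_function(output_true, output_pred) for (output_true, output_pred) in zip(outputs_data, outputs_model)])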
As far as I can tell, defining a single loss function over several outputs jointly is not supported. One could probably achieve that by incorporating the loss computation as a layer of the network.
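The per-output summation in the pseudo-code above, combined with loss_weights, can be sketched in plain Python. This is only an illustration of the arithmetic, not Keras' actual implementation: mse and total_loss are hypothetical helpers, and the target and prediction values are made up.

```python
def mse(y_true, y_pred):
    """Mean squared error over one output's values (stand-in for a Keras loss)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def total_loss(loss_fn, outputs_true, outputs_pred, loss_weights):
    """Weighted sum of per-output losses: loss_fn is called once per output."""
    return sum(
        w * loss_fn(t, p)
        for w, t, p in zip(loss_weights, outputs_true, outputs_pred)
    )


# Hypothetical targets and predictions for the two outputs y2 and y3.
y2_true, y2_pred = [1.0, 2.0], [1.5, 2.5]
y3_true, y3_pred = [0.0, 0.0], [3.0, 4.0]

# With loss_weights=[1.0, 0.0], only the y2 loss contributes to the total.
loss_y2_only = total_loss(mse, [y2_true, y3_true], [y2_pred, y3_pred], [1.0, 0.0])
loss_both = total_loss(mse, [y2_true, y3_true], [y2_pred, y3_pred], [1.0, 1.0])
print(loss_y2_only, loss_both)  # prints: 0.25 12.75
```

Note that loss_fn never sees both outputs at the same time, which is why a loss coupling y2 and y3 cannot be expressed this way.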
Answer 2:
Sharapolas' answer is right.
However, there is a better way than using a layer for building custom loss functions with complex interdependence of several outputs of a model.
The method I know to be used in practice is to never call model.compile and only call model._make_predict_function(). From there, you can build a custom optimizer method that calls model.output. This gives you all outputs, [y2, y3] in your case. After doing your magic with them to compute a loss, get a keras.optimizer and call its get_updates method with your model.trainable_weights and your loss. Finally, return a keras.backend.function with a list of the required inputs (in your case only model.input) and the updates you just got from the optimizer.get_updates call. This function now replaces model.fit.
The above is often used in policy gradient algorithms such as A3C or PPO. Here is an example of what I tried to explain: https://github.com/Hyeokreal/Actor-Critic-Continuous-Keras/blob/master/a2c_continuous.py. Look at the build_model and critic_optimizer methods, and read the keras.backend.function documentation to understand what happens.
I have found that this approach frequently runs into problems with session management, and it does not currently appear to work in TF 2.0 Keras at all. Hence, if anyone knows a method, please let me know. I came here looking for one :)
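For TF 2.x, one possible route is a custom training step built on tf.GradientTape. This is a different technique from the keras.backend.function approach described above and is not taken from either answer. The sketch below assumes TensorFlow 2.x is installed; the layer sizes, joint_loss, and train_step are hypothetical and only illustrate a loss that depends on both outputs at once.

```python
import tensorflow as tf

# Hypothetical two-output model; layer sizes and names are illustrative only.
inputs = tf.keras.Input(shape=(4,))
hidden = tf.keras.layers.Dense(8, activation="relu")(inputs)
y2 = tf.keras.layers.Dense(1, name="y2")(hidden)
y3 = tf.keras.layers.Dense(1, name="y3")(hidden)
model = tf.keras.Model(inputs=inputs, outputs=[y2, y3])

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)


def joint_loss(y2_true, y2_pred, y3_pred):
    """Illustrative loss that couples the two outputs."""
    fit_term = tf.reduce_mean(tf.square(y2_true - y2_pred))
    coupling_term = tf.reduce_mean(tf.square(y2_pred - y3_pred))
    return fit_term + 0.1 * coupling_term


def train_step(x, y2_true):
    """One gradient step on the joint loss, used instead of model.fit."""
    with tf.GradientTape() as tape:
        # Calling the model directly returns all outputs, here [y2, y3].
        y2_pred, y3_pred = model(x, training=True)
        loss = joint_loss(y2_true, y2_pred, y3_pred)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


# Random data, just to exercise a single step.
x_batch = tf.random.normal((16, 4))
y2_batch = tf.random.normal((16, 1))
loss_value = train_step(x_batch, y2_batch)
print(float(loss_value))
```

As in the older approach, model.compile is never called; the loop over batches that model.fit would normally run has to be written by hand around train_step.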
Source: https://stackoverflow.com/questions/44172165/keras-multiple-output-custom-loss-function