Question
I am creating a model in Keras and want to compute my own metric (perplexity). This requires the unnormalized probabilities/logits. However, the Keras model only returns the softmax probabilities:
model = Sequential()
model.add(embedding_layer)
model.add(LSTM(n_hidden, return_sequences=False))
model.add(Dropout(dropout_keep_prob))
model.add(Dense(vocab_size))
model.add(Activation('softmax'))
optimizer = RMSprop(lr=self.lr)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy')
The Keras FAQ has a solution for getting the output of intermediate layers here. Another solution is given here. However, these answers store the intermediate outputs in a different model, which is not what I need.
I want to use the logits for my custom metric. The custom metric should be passed to model.compile() so that it is evaluated and displayed during training. So I don't need the output of the Dense layer in a separate model, but as part of my original model.
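For context, a custom metric is just a function of (y_true, y_pred) passed to compile via the metrics argument. A minimal sketch (the metric body here is only a placeholder, not the actual perplexity computation):

from keras import backend as K

def custom_metric(y_true, y_pred):
    # y_pred is whatever the model's last layer outputs
    return K.mean(y_pred)

model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=[custom_metric])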
In short, my questions are:
- When defining a custom metric as outlined here using def custom_metric(y_true, y_pred), does y_pred contain logits or normalized probabilities?
- If it contains normalized probabilities, how can I get the unnormalized probabilities, i.e. the logits output by the Dense layer?
Answer 1:
I think I have found a solution.
First, I change the activation layer to linear so that I receive logits, as outlined by @loannis Nasios.
Second, to still get sparse_categorical_crossentropy as the loss function, I define my own loss function, setting the from_logits parameter to True.
from keras import backend as K

model = Sequential()
model.add(embedding_layer)
model.add(LSTM(n_hidden, return_sequences=False))
model.add(Dropout(dropout_keep_prob))
model.add(Dense(vocab_size))
model.add(Activation('linear'))
optimizer = RMSprop(lr=self.lr)

def my_sparse_categorical_crossentropy(y_true, y_pred):
    return K.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)

model.compile(optimizer=optimizer, loss=my_sparse_categorical_crossentropy)
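With logits coming out of the model, the perplexity metric from the question can be computed directly. A minimal sketch, assuming perplexity is defined as the exponential of the mean sparse categorical cross-entropy:

def perplexity(y_true, y_pred):
    # y_pred contains logits because the final activation is linear
    cross_entropy = K.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)
    return K.exp(K.mean(cross_entropy))

model.compile(optimizer=optimizer,
              loss=my_sparse_categorical_crossentropy,
              metrics=[perplexity])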
Answer 2:
Try changing the last activation from softmax to linear:
model = Sequential()
model.add(embedding_layer)
model.add(LSTM(n_hidden, return_sequences=False))
model.add(Dropout(dropout_keep_prob))
model.add(Dense(vocab_size))
model.add(Activation('linear'))
optimizer = RMSprop(lr=self.lr)
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy')
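Note that with a linear output you would also want the from_logits=True loss from Answer 1, since the built-in string loss treats y_pred as probabilities. If you still need probabilities at prediction time, you can apply the softmax yourself. A sketch in NumPy (x_batch is a hypothetical input batch):

import numpy as np

logits = model.predict(x_batch)
# numerically stable softmax over the vocabulary axis
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)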
Answer 3:
You can make one model for training and another for predictions.
For training, you can use the functional API's Model and simply take part of the existing model, leaving the Activation aside:
model = yourExistingModelWithSoftmax
modelForTraining = Model(model.input, model.layers[-2].output)

# use your loss function and custom metric in this model:
modelForTraining.compile(optimizer=optimizer,
                         loss=my_sparse_categorical_crossentropy,
                         metrics=[my_custom_metric])
Since one model is part of the other, they both share the same weights.
- When you want to train, use modelForTraining.fit()
- When you want to predict probabilities, use model.predict()
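Putting this together, a minimal runnable sketch of the two-model pattern (the toy layer sizes and the x_train, y_train names are placeholders, not from the original answer):

import numpy as np
from keras.models import Sequential, Model
from keras.layers import Dense, Activation
from keras import backend as K

vocab_size, n_hidden = 10, 8

# the full model, ending in softmax, used for prediction
model = Sequential()
model.add(Dense(vocab_size, input_shape=(n_hidden,)))
model.add(Activation('softmax'))

# the training model stops at the Dense layer, so it outputs logits
modelForTraining = Model(model.input, model.layers[-2].output)

def my_sparse_categorical_crossentropy(y_true, y_pred):
    return K.sparse_categorical_crossentropy(y_true, y_pred, from_logits=True)

modelForTraining.compile(optimizer='rmsprop',
                         loss=my_sparse_categorical_crossentropy)

x_train = np.random.random((32, n_hidden))
y_train = np.random.randint(vocab_size, size=(32, 1))
modelForTraining.fit(x_train, y_train, epochs=1)  # updates the shared weights

probs = model.predict(x_train[:4])  # softmax probabilities from the same weights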
Source: https://stackoverflow.com/questions/47036409/keras-how-to-get-unnormalized-logits-instead-of-probabilities