Strange behaviour of the loss function in keras model, with pretrained convolutional base

Asked by 长情又很酷 · 2020-12-03 11:55

I'm trying to create a model in Keras to make numerical predictions from pictures. My model has a DenseNet121 convolutional base, with a couple of additional layers.

2 Answers
  •  自闭症患者
     2020-12-03 12:55

    But dropout layers usually create opposite effect making loss on evaluation less than loss during training.

    Not necessarily! Although some of the neurons are dropped in a dropout layer, bear in mind that the outputs of the remaining neurons are scaled up by 1/(1 - rate) to compensate (so-called inverted dropout). At inference time (i.e. test time) dropout is removed entirely, and considering that you have trained your model for only one epoch, the behaviour you saw may happen. Don't forget that since you are training the model for just one epoch, only a portion of the neurons have been dropped in the dropout layer, but all of them are present at inference time.

    If you continue training the model for more epochs, you can expect the training loss and the test loss (on the same data) to become more or less the same.

    Experiment yourself: set the dropout rate to zero (or remove the Dropout layer(s) entirely) and see whether this still happens. (Note that a Dropout layer has no trainable weights, so setting its trainable attribute to False does not disable it; the rate is what controls it.)
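    The inverted-dropout scaling described above can be observed directly. A minimal sketch using tf.keras (TensorFlow 2 is assumed): with rate 0.5, surviving units are scaled by 1/(1 - 0.5) = 2 during training, while at inference the layer is a no-op.

```python
import numpy as np
from tensorflow import keras

# Dropout with rate 0.5: during training, kept units are scaled by 1/(1 - 0.5) = 2
drop = keras.layers.Dropout(0.5)
x = np.ones((1, 10), dtype="float32")

print(drop(x, training=True).numpy())   # entries are either 0.0 or 2.0
print(drop(x, training=False).numpy())  # identity: all ones
```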


    One may be confused (as I was) by seeing that, after one epoch of training, the training loss is not equal to evaluation loss on the same batch of data. And this is not specific to models with Dropout or BatchNormalization layers. Consider this example:

    from keras import layers, models
    import numpy as np
    
    # A small regression model trained on random data
    model = models.Sequential()
    model.add(layers.Dense(1000, activation='relu', input_dim=100))
    model.add(layers.Dense(1))
    
    model.compile(loss='mse', optimizer='adam')
    x = np.random.rand(32, 100)
    y = np.random.rand(32, 1)
    
    # Train for a single epoch on one batch of 32 samples
    print("Training:")
    model.fit(x, y, batch_size=32, epochs=1)
    
    # Then evaluate on exactly the same data
    print("\nEvaluation:")
    loss = model.evaluate(x, y)
    print(loss)
    

    The output:

    Training:
    Epoch 1/1
    32/32 [==============================] - 0s 7ms/step - loss: 0.1520
    
    Evaluation:
    32/32 [==============================] - 0s 2ms/step
    0.7577340602874756
    

    So why are the losses different if they were computed over the same data, i.e. why is 0.1520 != 0.7577?

    If you ask this, it's because you, like me, have not paid enough attention: that 0.1520 is the loss before updating the parameters of the model (i.e. before the backward pass, or backpropagation), while 0.7577 is the loss after the weights of the model have been updated. Even though the data used is the same, the state of the model when computing those loss values is not the same. (Another question: so why has the loss increased after backpropagation? Simply because you have trained it for only one epoch, so the weight updates are not stable yet.)
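    One way to see this directly is to compare the loss returned by a single training step (which is computed with the pre-update weights) against `evaluate` run before and after it. A minimal sketch using tf.keras; the model and data here are made up for illustration:

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.layers.Dense(1, input_shape=(5,))])
model.compile(loss="mse", optimizer="adam")
x = np.random.rand(32, 5).astype("float32")
y = np.random.rand(32, 1).astype("float32")

loss_before = model.evaluate(x, y, verbose=0)  # loss with the current weights
step_loss = model.train_on_batch(x, y)         # forward pass uses those same weights,
                                               # then the gradient update is applied
loss_after = model.evaluate(x, y, verbose=0)   # loss with the updated weights

# step_loss matches loss_before; loss_after reflects the new weights
print(step_loss, loss_before, loss_after)
```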

    To confirm this, you can also use the same data batch as the validation data:

    model.fit(x, y, batch_size=32, epochs=1, validation_data=(x,y))
    

    If you run the code above with this modified line, you will get an output like this (obviously the exact values may differ for you):

    Training:
    Train on 32 samples, validate on 32 samples
    Epoch 1/1
    32/32 [==============================] - 0s 15ms/step - loss: 0.1273 - val_loss: 0.5344
    
    Evaluation:
    32/32 [==============================] - 0s 89us/step
    0.5344240665435791
    

    You see that the validation loss and the evaluation loss are exactly the same: that's because validation is performed at the end of the epoch (i.e. when the model weights have already been updated).
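    This equality can also be checked programmatically. A sketch in tf.keras, using a plain Dense model so that no Dropout or BatchNormalization is involved:

```python
import numpy as np
from tensorflow import keras

model = keras.Sequential([keras.layers.Dense(1, input_shape=(4,))])
model.compile(loss="mse", optimizer="adam")
x = np.random.rand(16, 4).astype("float32")
y = np.random.rand(16, 1).astype("float32")

history = model.fit(x, y, epochs=1, validation_data=(x, y), verbose=0)
val_loss = history.history["val_loss"][0]    # computed after the weight update
eval_loss = model.evaluate(x, y, verbose=0)  # same weights, same data

print(val_loss, eval_loss)  # the two values agree
```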
