Model with BatchNormalization: stagnant test loss

Posted by 三世轮回 on 2019-12-23 01:44:03

Question


I wrote a neural network using Keras. It contains BatchNormalization layers.

When I trained it with model.fit, everything was fine. When I train it with raw TensorFlow as explained here, training is fine, but the validation step always gives very poor performance and quickly saturates (the accuracy goes 5%, 10%, 40%, 40%, 40%, ...; the loss is stagnant too).

I need to use TensorFlow because it gives me more flexibility in monitoring the training.

I strongly suspect it has something to do with the BN layers and/or the way I compute the test performance (see below):

feed_dict = {x: X_valid,
             batch_size_placeholder: X_valid.shape[0],
             K.learning_phase(): 0,
             beta: self.warm_up_schedule(global_step)
             }
if self.weights is not None:
    feed_dict[weights] = self.weights
acc = accuracy.eval(feed_dict=feed_dict)

Is there anything special to do when computing the validation accuracy of a model containing Keras BatchNormalization layers?

Thank you in advance!


Answer 1:


Actually, I found out about the training argument of the __call__ method of the BatchNormalization layer.

So what you can do when instantiating the layer is just:

from keras import backend as K
from keras.layers import Input, Dense, BatchNormalization

x = Input((dim1, dim2))
h = Dense(dim3)(x)
h = BatchNormalization()(h, training=K.learning_phase())  # BN follows the learning phase

And when evaluating performance on the validation set, feed K.learning_phase() as 0 so that BatchNormalization normalizes with its moving averages instead of the current batch statistics:

feed_dict = {x: X_valid,
             batch_size_placeholder: X_valid.shape[0],
             K.learning_phase(): 0,
             beta: self.warm_up_schedule(global_step)
             }
acc = accuracy.eval(feed_dict=feed_dict)
summary_ = merged.eval(feed_dict=feed_dict)
test_writer.add_summary(summary_, global_step)
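
For context, here is a minimal end-to-end sketch of this pattern in a raw TensorFlow training loop. It assumes standalone Keras 2.x on a TensorFlow 1.x backend (the era of this question); the layer sizes, learning rate, and data names are hypothetical placeholders. One caveat the snippet above glosses over: model.fit runs BatchNormalization's moving-average updates for you, but a hand-written TensorFlow loop must run them explicitly, otherwise the inference-mode statistics are never updated and validation accuracy can stagnate, which could explain the symptom described in the question.

# Minimal sketch, assuming standalone Keras 2.x on a TensorFlow 1.x backend;
# sizes, learning rate, and data names are hypothetical placeholders.
import tensorflow as tf
from keras import backend as K
from keras.layers import Input, Dense, BatchNormalization
from keras.models import Model

sess = tf.Session()
K.set_session(sess)  # build the Keras layers into this session's graph

dim_in, dim_hidden, n_classes = 20, 64, 10  # hypothetical sizes

x = Input(shape=(dim_in,))
h = Dense(dim_hidden, activation='relu')(x)
# Tie BN's mode to the learning-phase placeholder: batch statistics when
# it is fed 1 (training), moving averages when it is fed 0 (inference).
h = BatchNormalization()(h, training=K.learning_phase())
logits = Dense(n_classes)(h)
model = Model(x, logits)  # used here only to collect BN's update ops

labels = tf.placeholder(tf.int64, shape=(None,))
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=labels, logits=logits))
accuracy = tf.reduce_mean(tf.cast(
    tf.equal(tf.argmax(logits, axis=1), labels), tf.float32))

# model.fit runs BN's moving-average updates automatically; a raw TF loop
# must run them itself, here by chaining them to the optimizer step.
with tf.control_dependencies(model.updates):
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

sess.run(tf.global_variables_initializer())

# Training step: learning phase 1, BN normalizes with batch statistics.
#   sess.run(train_op, {x: X_batch, labels: y_batch, K.learning_phase(): 1})
# Validation: learning phase 0, BN normalizes with its moving averages.
#   acc = sess.run(accuracy, {x: X_valid, labels: y_valid, K.learning_phase(): 0})

The only difference between a training step and a validation step is the value fed to K.learning_phase(): 1 selects batch statistics (and the chained update ops refresh the moving averages), while 0 selects the moving averages.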


Source: https://stackoverflow.com/questions/43654483/model-with-batchnormalization-stagnant-test-loss
