Resnet50 does not converge. VGG16 works fine

Submitted by 一曲冷凌霜 on 2020-01-16 08:23:15

Question


I trained a regression network using ResNet50 as the backbone. The input to the network is an image of size 224*224*3, and the output is a single value between 0 and 1.

But the network does not converge, whether I use sigmoid or ReLU as the output layer's activation, and whether I use MAE or MSE as the loss function.

For example, with ResNet50 as the backbone, MAE as the loss function, sigmoid as the output layer's activation, and SGD as the optimizer, the training loss is:

Epoch 1 training loss is 0.4900, val_loss is 0.4797

Epoch 2 training loss is 0.4923, val_loss is 0.4794

Epoch 3 training loss is 0.4923, val_loss is 0.4783

...

Epoch 35 training loss is 0.4923, val_loss is 0.4771

The training loss does not change: after the first epoch it stays constant at 0.4923, and val_loss is always about 0.47. I tried different optimizers and learning rates, but the network still does not converge.

When I use VGG16 or MobileNet as the backbone, the network converges. Could anyone give me some suggestions on how to fix this problem?


Answer 1:


Can you somehow validate that the ResNet50 backbone is implemented correctly? Maybe try training it on MNIST to see whether it works in general.

It seems to me that the ResNet variant just outputs some mean value instead of learning the actual problem.
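One way to test that suspicion is to compare the model's error with the error of the best constant predictor and to check how much the predictions vary across samples. The snippet below is only a minimal sketch of that check in NumPy. The helper names (`constant_baseline_mae`, `looks_collapsed`, `model_mae`), the tolerance, and the synthetic `y_val` / `y_pred` arrays are all assumptions made for illustration, not something taken from the question; in practice `y_pred` would come from running the trained network on the validation set.

```python
import numpy as np


def constant_baseline_mae(y_true):
    """MAE of the best constant predictor (the median minimizes MAE)."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(y_true - np.median(y_true))))


def model_mae(y_true, y_pred):
    """Plain mean absolute error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def looks_collapsed(y_pred, tol=1e-3):
    """True if the predictions barely vary across samples.

    `tol` is an arbitrary illustrative threshold, not a recommended value.
    """
    return bool(np.std(np.asarray(y_pred, dtype=float)) < tol)


# Hypothetical stand-in data: targets in [0, 1] and a model that always
# outputs the same value. Replace these with real validation targets and
# the network's predictions on the validation set.
rng = np.random.default_rng(0)
y_val = rng.uniform(0.0, 1.0, size=1000)
y_pred = np.full_like(y_val, 0.5)

print("model MAE:            ", model_mae(y_val, y_pred))
print("constant baseline MAE:", constant_baseline_mae(y_val))
print("predictions collapsed:", looks_collapsed(y_pred))
```

If the model's MAE is no better than the constant baseline and the predictions have almost no spread, the network is effectively emitting a single value for every image, which would be consistent with a loss that stays flat across epochs.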

Can you give some more information on what you want to achieve: what your regression looks like and what input is expected from the backbone? You might also want to look at similar work (if any exists) and see which architectures and hyperparameters were used.



Source: https://stackoverflow.com/questions/59656204/resnet50-does-not-converge-vgg16-works-fine
