How to take the average of the weights of two networks?

Posted by 瘦欲@ on 2019-12-21 12:08:04

Question


Suppose that in PyTorch I have model1 and model2, which have the same architecture. They were both trained further on the same data, or one is an earlier version of the other, but that is not technically relevant to the question. Now I want to set the weights of model to be the average of the weights of model1 and model2. How would I do that in PyTorch?


Answer 1:


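beta = 0.5  # interpolation parameter; 0.5 gives the plain average
params1 = dict(model1.named_parameters())
params2 = dict(model2.named_parameters())

# Start from model2's full state_dict so that buffers (e.g. BatchNorm statistics) are included.
new_state = dict(model2.state_dict())

for name, param2 in params2.items():
    if name in params1:
        # Build a new tensor so that model1 and model2 themselves are left unchanged.
        new_state[name] = beta * params1[name].data + (1 - beta) * param2.data

model.load_state_dict(new_state)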

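Based on a snippet from the PyTorch forums. The idea is to grab the parameters of both models, combine them, and load the result back into the target model. This only works if the two models have the same parameter names and the corresponding tensors have the same shapes.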
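The same idea can be wrapped in a reusable function. The following is only a minimal sketch, not part of the original answer: the name average_models is hypothetical, and it assumes both models share the same architecture. It works on the full state_dict (parameters and buffers), checks that the shapes match, and copies non-floating-point buffers (for example BatchNorm's num_batches_tracked) from model1 instead of averaging them, which is one possible choice among several.

```python
import copy

import torch
from torch import nn


def average_models(model1: nn.Module, model2: nn.Module, beta: float = 0.5) -> nn.Module:
    """Return a new model with weights beta * model1 + (1 - beta) * model2.

    Hypothetical helper for illustration; neither input model is modified.
    """
    state1 = model1.state_dict()
    state2 = model2.state_dict()

    if state1.keys() != state2.keys():
        raise ValueError("The two models do not have the same state_dict keys")

    averaged = {}
    for name, tensor1 in state1.items():
        tensor2 = state2[name]
        if tensor1.shape != tensor2.shape:
            raise ValueError(f"Shape mismatch for '{name}'")
        if torch.is_floating_point(tensor1):
            # Weights, biases and floating-point buffers are interpolated.
            averaged[name] = beta * tensor1 + (1 - beta) * tensor2
        else:
            # Assumption: integer buffers are taken from model1 rather than averaged.
            averaged[name] = tensor1.clone()

    # Copy the architecture from model1, then overwrite its weights.
    model = copy.deepcopy(model1)
    model.load_state_dict(averaged)
    return model
```

With this helper, model = average_models(model1, model2) gives the plain average, and a different beta gives a weighted interpolation between the two models.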

I would also be really interested to hear what you find with this.



Source: https://stackoverflow.com/questions/48560227/how-to-take-the-average-of-the-weights-of-two-networks
