TensorFlow - Averaging model weights from restored models

Submitted by 孤街浪徒 on 2019-12-24 04:19:13

Question


I trained several different models on the same data, and all of the neural networks have the same architecture. Is it possible to restore those models, average their weights, and initialise my weights using the average?

This is an example of how the graph might look. Basically what I need is an average of the weights I am going to load.

import tensorflow as tf
import numpy as np

#init model1 weights
weights = {
    'w1': tf.Variable(),
    'w2': tf.Variable()
}
# init model1 biases
biases = {
    'b1': tf.Variable(),
    'b2': tf.Variable()
}
#init model2 weights
weights2 = {
    'w1': tf.Variable(),
    'w2': tf.Variable()
}
# init model2 biases
biases2 = {
    'b1': tf.Variable(),
    'b2': tf.Variable(),
}

# this is the average I want to create
# (only 'w1'/'w2' and 'b1'/'b2' exist in the dicts above, so only those are averaged)
w = {
    'w1': tf.Variable(
        tf.add(weights["w1"], weights2["w1"])/2
    ),
    'w2': tf.Variable(
        tf.add(weights["w2"], weights2["w2"])/2
    )
}
# init averaged biases
b = {
    'b1': tf.Variable(
        tf.add(biases["b1"], biases2["b1"])/2
    ),
    'b2': tf.Variable(
        tf.add(biases["b2"], biases2["b2"])/2
    )
}

weights_saver = tf.train.Saver({
    'w1' : weights['w1'],
    'w2' : weights['w2'],
    'b1' : biases['b1'],
    'b2' : biases['b2']
    })
weights_saver2 = tf.train.Saver({
    'w1' : weights2['w1'],
    'w2' : weights2['w2'],
    'b1' : biases2['b1'],
    'b2' : biases2['b2']
    })

And this is what I want to get when I run the tf session. c contains the weights I want to use to start the training.

# Create a session for running operations in the Graph.
init_op = tf.global_variables_initializer()
init_op2 = tf.local_variables_initializer()

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    # Initialize the variables (like the epoch counter).
    sess.run(init_op)
    sess.run(init_op2)
    weights_saver.restore(
        sess,
        'my_model1/model_weights.ckpt'
    )
    weights_saver2.restore(
        sess,
        'my_model2/model_weights.ckpt'
    )
    a = sess.run(weights)
    b = sess.run(weights2)
    c = sess.run(w)

Answer 1:


First, I assume the model structure is exactly the same (same number of layers, same number of nodes per layer). If not, you will have problems mapping variables (there will be variables in one model but not in the other).
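One way to check this assumption before averaging is to compare variable names and shapes. The sketch below is only an illustration: it assumes each model's variables have already been fetched into a dict mapping variable name to NumPy array, and the helper name check_same_structure and the example dicts are hypothetical.

```python
import numpy as np


def check_same_structure(values_a, values_b):
    """Return a list of problems found when comparing two name -> array dicts."""
    problems = []
    # Names that appear in one model but not in the other.
    for name in sorted(set(values_a) ^ set(values_b)):
        problems.append("missing in one model: " + name)
    # Names that appear in both models but with different shapes.
    for name in sorted(set(values_a) & set(values_b)):
        if values_a[name].shape != values_b[name].shape:
            problems.append("shape mismatch: " + name)
    return problems


# Hypothetical values standing in for restored models.
model1 = {"w1": np.zeros((4, 3)), "b1": np.zeros(3)}
model2 = {"w1": np.ones((4, 3)), "b1": np.ones(3)}
model3 = {"w1": np.ones((4, 5)), "b2": np.ones(5)}

print(check_same_structure(model1, model2))  # compatible models: empty list
print(check_same_structure(model1, model3))  # incompatible models: list of problems
```

An empty list means every variable can be paired with a same-shaped counterpart, which is what element-wise averaging needs.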

What you want is three sessions. The first two are loaded from the checkpoints, and the last one holds the average. You want this because each session holds its own copy of the variable values.

After you load a model, use tf.trainable_variables() to get a list of all the trainable variables in the model. You can pass that list to sess.run to get the variable values as NumPy arrays. After you compute the averages, use tf.assign to create operations that update the variables. You can also use the list to change the initializers, but that means passing the values into the model (not always an option).

Roughly:

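import tensorflow as tf

# Omitted code: build the model once in the default graph.

# All three sessions share the default graph, but each one
# holds its own values for the variables.
session1 = tf.Session()
session2 = tf.Session()
session3 = tf.Session()

# Omitted code: Restore session1 and session2.
# Optionally initialize session3.

# Fetch the restored values of every trainable variable as numpy arrays.
all_vars = tf.trainable_variables()
values1 = session1.run(all_vars)
values2 = session2.run(all_vars)

# Build one assign op per variable that writes the element-wise average.
all_assign = []
for var, val1, val2 in zip(all_vars, values1, values2):
  all_assign.append(tf.assign(var, (val1 + val2) / 2))

# Running the assign ops in session3 stores the averages there.
session3.run(all_assign)

# Do whatever you want with session 3.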
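
The averaging step itself does not depend on TensorFlow and extends to more than two models. The following is a minimal NumPy sketch, assuming each model's values arrive as a list of arrays in the same variable order (as returned by running all_vars in a session); the helper name average_values and the example arrays are hypothetical.

```python
import numpy as np


def average_values(per_model_values):
    """Average corresponding arrays across models.

    per_model_values holds one entry per model; each entry is a list of
    arrays in the same variable order.
    """
    averaged = []
    # zip(*...) groups the i-th array of every model together.
    for arrays in zip(*per_model_values):
        averaged.append(np.mean(np.stack(arrays), axis=0))
    return averaged


# Hypothetical values standing in for two restored models.
values1 = [np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([0.0, 2.0])]
values2 = [np.array([[3.0, 4.0], [5.0, 6.0]]), np.array([2.0, 4.0])]

averaged = average_values([values1, values2])
print(averaged[0])
print(averaged[1])
```

Each averaged array could then be paired with its variable and passed to tf.assign, in the same way as the loop above.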


Source: https://stackoverflow.com/questions/50373678/tensorflow-averaging-model-weights-from-restored-models
