Tensorflow: How can I assign numpy pre-trained weights to subsections of graph?

▼魔方 西西 提交于 2021-02-07 05:21:51

问题


This is a simple thing which I just couldn't figure out how to do.

I converted a pre-trained VGG caffe model to tensorflow using the github code from https://github.com/ethereon/caffe-tensorflow and saved it to vgg16.npy...

I then load the network to my sess default session as "net" using:

images = tf.placeholder(tf.float32, [1, 224, 224, 3])
net = VGGNet_xavier({'data': images, 'label' : 1}) 
with tf.Session() as sess:
  net.load("vgg16.npy", sess) 

After net.load, I get a graph with a list of tensors. I can access individual tensors per layer using net.layers['conv1_1']... to get weights and biases for the first VGG convolutional layer, etc.

Now suppose that I make another graph that has as its first layer "h_conv1_b":

  W_conv1_b = weight_variable([3,3,3,64])
  b_conv1_b = bias_variable([64])
  h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)

My question is -- how do you get to assign the pre-trained weights from net.layers['conv1_1'] to h_conv1_b ?? (both are now tensors)


回答1:


I suggest you have a detailed look at network.py from the https://github.com/ethereon/caffe-tensorflow, especially the function load(). It would help you understand what happened when you called net.load(weight_path, session).

FYI, variables in Tensorflow can be assigned to a numpy array by using var.assign(np_array) which is executed in the session. Here is the solution to your question:

with tf.Session() as sess:    
  W_conv1_b = weight_variable([3,3,3,64])
  sess.run(W_conv1_b.assign(net.layers['conv1_1'].weights))
  b_conv1_b = bias_variable([64])
  sess.run(b_conv1_b.assign(net.layers['conv1_1'].biases))
  h_conv1_b = tf.nn.relu(conv2d(im_batch, W_conv1_b) + b_conv1_b)

I would like to kindly remind you the following points:

  1. var.assign(data) where 'data' is a numpy array and 'var' is a TensorFlow variable should be executed in the same session where you want to continue to execute your network either inference or training.
  2. The 'var' should be created as the same shape as the 'data' by default. Therefore, if you can obtain the 'data' before creating the 'var', I suggest you create the 'var' by the method var=tf.Variable(shape=data.shape). Otherwise, you need to create the 'var' by the method var=tf.Variable(validate_shape=False), which means the variable shape is feasible. Detailed explainations can be found in the Tensorflow's API doc.

I extend the same repo caffe-tensorflow to support theano in caffe so that I can load the transformed model from caffe in Theano. Therefore, I am a reasonable expert w.r.t this repo's code. Please feel free to get in contact with me as you have any further question.




回答2:


You can get variable values using eval method of tf.Variable-s from the first network and load that values into variables of the second network using load method (also method of the tf.Variable).



来源:https://stackoverflow.com/questions/39555256/tensorflow-how-can-i-assign-numpy-pre-trained-weights-to-subsections-of-graph

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!