Convert Tensorflow model to Caffe model

|▌冷眼眸甩不掉的悲伤 提交于 2020-12-27 08:47:38

问题


I would like to be able to convert a Tensorflow model to Caffe model.

I searched on google but I was able to find only converters from caffe to tensorflow but not the opposite.

Does anyone have an idea on how to do it?

Thanks, Evi


回答1:


I've had the same problem and found a solution. The code can be found here (https://github.com/lFatality/tensorflow2caffe) and I've also documented the code in some Youtube videos.


Part 1 covers the creation of the architecture of VGG-19 in Caffe and tflearn (higher level API for TensorFlow, with some changes to the code native TensorFlow should also work).


In Part 2 the export of the weights and biases out of the TensorFlow model into a numpy file is described. In tflearn you can get the weights of a layer like this:

#get parameters of a certain layer
conv2d_vars = tflearn.variables.get_layer_variables_by_name(layer_name)
#get weights out of the parameters
weights = model.get_weights(conv2d_vars[0])
#get biases out of the parameters
biases = model.get_weights(conv2d_vars[1])

For a convolutional layer, the layer_name is Conv_2D. Fully-Connected layers are called FullyConnected. If you use more than one layer of a certain type, a raising integer with a preceding underscore is used (e.g. the 2nd conv layer is called Conv_2D_1). I've found these names in the graph of the TensorBoard. If you name the layers in your architecture definition, then these layer_names might change to the names you defined.

In native TensorFlow the export will need different code but the format of the parameters should be the same so subsequent steps should still be applicable.


Part 3 covers the actual conversion. What's critical is the conversion of the weights when you create the caffemodel (the biases can be carried over without change). TensorFlow and Caffe use different formats when saving a filter. While TensorFlow uses [height, width, depth, number of filters] (TensorFlow docs, at the bottom), Caffe uses [number of filters, depth, height, width] (Caffe docs, chapter 'Blob storage and communication'). To convert between the formats you can use the transpose function (for example: weights_of_first_conv_layer.transpose((3,2,0,1)). The 3,2,0,1 sequence can be obtained by enumerating the TensorFlow format (origin) and then switching it to the Caffe format (target format) while keeping the numbers at their specific variable.).
If you want to connect a tensor output to a fully-connected layer, things get a little tricky. If you use VGG-19 with an input size of 112x112 it looks like this.

fc1_weights = data_file[16][0].reshape((4,4,512,4096))
fc1_weights = fc1_w.transpose((3,2,0,1))
fc1_weights = fc1_w.reshape((4096,8192))

What you get from TensorFlow if you export the parameters at the connection between tensor and fully-connected layer is an array with the shape [entries in the tensor, units in the fc-layer] (here: [8192, 4096]). You have to find out what the shape of your output tensor is and then reshape the array so that it fits the TensorFlow format (see above, number of filters being the number of units in the fc-layer). After that you use the transpose-conversion you've used previously and then reshape the array again, but the other way around. While TensorFlow saves fc-layer weights as [number of inputs, number of outputs], Caffe does it the other way around.
If you connect two fc-layers to each other, you don't have to do the complex process previously described but you will have to account for the different fc-layer format by transposing again (fc_layer_weights.transpose((1,0)))

You can then set the parameters of the network using

net.params['layer_name_in_prototxt'][0].data[...] = weights
net.params['layer_name_in_prototxt'][1].data[...] = biases

This was a quick overview. If you want all the code, it's in my github repository. I hope it helps. :)


Cheers,
Fatality




回答2:


As suggested in the comment by @Patwie, you have to do it manually by copying the weights layer by layer. For example, to copy the first conv layer weights from a tensorflow checkpoint to a caffemodel, you have to do something like following:

sess = tf.Session()
new_saver = tf.train.import_meta_graph("/path/to/checkpoint.meta")
what = new_saver.restore(sess, "/path/to/checkpoint")

all_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)

conv1 = all_vars[0]
bias1 = all_vars[1]

conv_w1, bias_1 = sess.run([conv1,bias1])

net = caffe.Net('path/to/conv.prototxt', caffe.TEST)

net.params['conv_1'][0].data[...] = conv_w1
net.params['conv_1'][1].data[...] = bias_1

...

net.save('modelfromtf.caffemodel')

Note1: This code has NOT been tested. I am not sure if this will work, but I think it should. Also, this is for one conv layer, only. In practice, you have to first analyse your tensorflow checkpoint to check which layer weights are at which index(print all_vars) and then copy each layer's weights individually.

Note2: Some automation can be done by iterating over the initial conv layers as they generally follow a set pattern (conv1->bn1->relu1->conv2->bn2->relu2...)

Note3: Tensorflow may further divide each layer weights into separate indices. For example: weights and biases are separated for a conv layer as shown above. Also, gamma, mean and variance are separated for batch normalisation layer.




回答3:


You can use the utility MMDNN developed by Microsoft. MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models.



来源:https://stackoverflow.com/questions/41138481/convert-tensorflow-model-to-caffe-model

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!