Unexpected layers were generated in the MNIST example in TensorBoard


Question


In order to learn TensorFlow, I ran the official MNIST script (cnn_mnist.py) and displayed the graph with TensorBoard.

The following is part of the code. This network contains two conv layers and two dense layers.

conv1 = tf.layers.conv2d(inputs=input_layer, filters=32, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu)

pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[5, 5],
                         padding="same", activation=tf.nn.relu)

pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])

dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)

dropout = tf.layers.dropout(inputs=dense, rate=0.4,
                            training=mode == tf.estimator.ModeKeys.TRAIN)

logits = tf.layers.dense(inputs=dropout, units=10)

However, looking at the graph generated by TensorBoard, there are three conv layers and three dense layers. I did not expect conv2d_1 and dense_1 to be generated.

Why were conv2d_1 and dense_1 generated?


Answer 1:


This is a good question, because it sheds some light on the inner structure of the tf.layers wrappers. Let's run two experiments:

  • Run the model exactly as in the question.
  • Add explicit names to the layers via the name argument and run again.

The graph without layer names

That's the same graph as yours, but expanded and zoomed in on the final (logits) dense layer. Note that dense_1 contains the layer variables (kernel and bias) and dense_2 contains the ops (matrix multiplication and addition).

This means that this is still one layer, but with two naming scopes: dense_1 and dense_2. This happens because it is the second dense layer, and the first one already took the naming scope dense. Variable creation is separated from the actual layer logic (there are separate build and call methods), and each of them tries to get a unique name for its scope. This leads to dense_1 holding the variables and dense_2 holding the ops.
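As a minimal sketch of this mechanism (assuming the TensorFlow 1.x API, where tf.layers lives; exact scope names can vary between versions), you can build two unnamed dense layers and list the resulting op names:

import tensorflow as tf  # TF 1.x API assumed

x = tf.placeholder(tf.float32, [None, 8])
d1 = tf.layers.dense(inputs=x, units=4)   # first layer grabs the scope "dense"
d2 = tf.layers.dense(inputs=d1, units=2)  # build() takes "dense_1" for variables,
                                          # call() takes "dense_2" for the ops

for op in tf.get_default_graph().get_operations():
    print(op.name)
# Expect kernel/bias under dense_1/... and MatMul/BiasAdd under dense_2/...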

The graph with names specified

Now let's add name='logits' to the same layer and run again:

logits = tf.layers.dense(inputs=dropout, units=10, name='logits')

You can see there are still two variables and two ops, but the layer managed to grab one unique name for the scope (logits) and put everything inside it.
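You can also verify this without TensorBoard by listing the graph ops and checking that they all share the single logits/ prefix (again a TensorFlow 1.x sketch; the input shape here is made up):

import tensorflow as tf  # TF 1.x API assumed

x = tf.placeholder(tf.float32, [None, 1024])
logits = tf.layers.dense(inputs=x, units=10, name='logits')

print([op.name for op in tf.get_default_graph().get_operations()
       if op.name.startswith('logits/')])
# kernel, bias, MatMul and BiasAdd all live under the one logits/ scope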

Conclusion

This is a good example of why explicit naming in TensorFlow is beneficial, whether for tensors directly or for higher-level layers. There is much less confusion when the model uses meaningful names instead of automatically generated ones.




Answer 2:


These are just the variable creations and other hidden operations that happen inside tf.layers.conv2d beyond the convolution operation itself (tf.nn.conv2d) and the activation (the same goes for the dense layer). Only two convolutions actually happen: if you follow your data through the graph, it never goes through conv2d_1 or dense_1; the results of those ops (basically the variables needed for the convolution) are simply fed as inputs to the convolution operation itself. I'm actually more surprised not to see the same thing for the first conv2d, but I really wouldn't worry about it!
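A quick way to convince yourself of this (a sketch under the same TensorFlow 1.x assumptions, with a made-up input shape) is to count the ops whose type is Conv2D; there are exactly two, however many conv2d_* scopes TensorBoard draws:

import tensorflow as tf  # TF 1.x API assumed

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
c1 = tf.layers.conv2d(inputs=x, filters=32, kernel_size=[5, 5],
                      padding="same", activation=tf.nn.relu)
c2 = tf.layers.conv2d(inputs=c1, filters=64, kernel_size=[5, 5],
                      padding="same", activation=tf.nn.relu)

conv_ops = [op for op in tf.get_default_graph().get_operations()
            if op.type == 'Conv2D']
print(len(conv_ops))  # -> 2: the extra scopes hold variables, not convolutions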



Source: https://stackoverflow.com/questions/48545477/unexpected-layers-were-generated-in-the-mnist-example-in-tensorboard
