What is the default kernel initializer in tf.layers.conv2d and tf.layers.dense?

不知归路 2020-12-04 11:51

The official TensorFlow API docs claim that the parameter kernel_initializer defaults to None for tf.layers.conv2d and tf.layers.dense. If the default is None, what initializer is actually used?

4 Answers
  •  暖寄归人
    2020-12-04 12:20

    Great question! It takes a bit of digging to find out.

    • The default is not documented in the tf.layers.conv2d API docs
    • If you look at the function's definition, you see that it calls variable_scope.get_variable:

    In code:

    self.kernel = vs.get_variable('kernel',
                                  shape=kernel_shape,
                                  initializer=self.kernel_initializer,
                                  regularizer=self.kernel_regularizer,
                                  trainable=True,
                                  dtype=self.dtype)
    

    Next step: what does variable_scope.get_variable do when the initializer is None?

    The get_variable documentation says:

    If initializer is None (the default), the default initializer passed in the constructor is used. If that one is None too, we use a new glorot_uniform_initializer.

    So the answer is: it uses the glorot_uniform_initializer.
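
    Equivalently, you can spell the default out. The sketch below (assuming TF 1.x, where tf.layers and tf.glorot_uniform_initializer exist; the input shape is arbitrary) builds the same conv layer twice, once with the implicit default and once with the initializer passed explicitly:

    import tensorflow as tf  # assumes TF 1.x

    images = tf.placeholder(tf.float32, [None, 28, 28, 1])

    # Both layers end up with the same kernel initialization scheme:
    implicit = tf.layers.conv2d(images, 32, 3)  # kernel_initializer=None
    explicit = tf.layers.conv2d(images, 32, 3,
                                kernel_initializer=tf.glorot_uniform_initializer())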

    For completeness, the definition of this initializer:

    The Glorot uniform initializer, also called the Xavier uniform initializer. It draws samples from a uniform distribution within [-limit, limit], where limit is sqrt(6 / (fan_in + fan_out)), fan_in is the number of input units in the weight tensor, and fan_out is the number of output units in the weight tensor. Reference: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
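
    In plain NumPy, that sampling rule reads roughly as follows. This is an illustrative sketch of the formula above, not TensorFlow's implementation; glorot_uniform here is a hypothetical helper:

    import numpy as np

    def glorot_uniform(fan_in, fan_out):
        # limit = sqrt(6 / (fan_in + fan_out)), per Glorot & Bengio (2010)
        limit = np.sqrt(6.0 / (fan_in + fan_out))
        return np.random.uniform(-limit, limit, size=(fan_in, fan_out))

    w = glorot_uniform(128, 64)  # e.g. a dense kernel: 128 in, 64 out
    print(w.min(), w.max())      # both within ±sqrt(6/192) ≈ ±0.177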

    Edit: this is what I found in the code and documentation. You can verify that the initialization really looks like this by running eval on the weights, for example:
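
    Something like this (assuming TF 1.x; the layer sizes are arbitrary) evals the kernel of a fresh tf.layers.dense layer and checks it against the Glorot bound:

    import numpy as np
    import tensorflow as tf  # assumes TF 1.x

    x = tf.placeholder(tf.float32, [None, 128])
    y = tf.layers.dense(x, 64)  # kernel_initializer left at its default (None)

    kernel = [v for v in tf.trainable_variables() if 'kernel' in v.name][0]
    limit = np.sqrt(6.0 / (128 + 64))  # Glorot bound for a 128x64 kernel

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        w = sess.run(kernel)             # eval the freshly initialized weights
        print(np.abs(w).max() <= limit)  # True for a Glorot uniform draw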
