The official TensorFlow API doc claims that the parameter kernel_initializer defaults to None for tf.layers.conv2d and tf.layers.dense.
Great question! It is quite a trick to find out!
If you look at how the layer builds its kernel, you see that it calls variable_scope.get_variable. In code:
self.kernel = vs.get_variable('kernel',
                              shape=kernel_shape,
                              initializer=self.kernel_initializer,
                              regularizer=self.kernel_regularizer,
                              trainable=True,
                              dtype=self.dtype)
Next step: what does the variable scope do when the initializer is None?
Here it says:
If initializer is None (the default), the default initializer passed in the constructor is used. If that one is None too, we use a new glorot_uniform_initializer.
So the answer is: it uses the glorot_uniform_initializer
For completeness, the definition of this initializer:

The Glorot uniform initializer, also called the Xavier uniform initializer. It draws samples from a uniform distribution within [-limit, limit] where limit is sqrt(6 / (fan_in + fan_out)), fan_in is the number of input units in the weight tensor, and fan_out is the number of output units in the weight tensor. Reference: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
Edit: this is what I found in the code and documentation. Perhaps you could verify that the initialization looks like this by running eval on the weights!
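As a minimal sketch of such a check (assuming a TensorFlow 1.x environment; the input shape and filter count below are made up for illustration), you could build a conv layer without any kernel_initializer, evaluate the kernel, and compare the values against the Glorot limit:

import numpy as np
import tensorflow as tf  # assumes TF 1.x, where tf.layers.conv2d exists

x = tf.placeholder(tf.float32, [None, 28, 28, 3])    # made-up input shape
y = tf.layers.conv2d(x, filters=16, kernel_size=3)   # no kernel_initializer given

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    kernel = [v for v in tf.global_variables() if 'kernel' in v.name][0]
    w = sess.run(kernel)
    # Glorot limit for a 3x3 kernel with 3 input and 16 output channels:
    # fan_in = 3*3*3 = 27, fan_out = 3*3*16 = 144, limit = sqrt(6 / 171) ≈ 0.187
    limit = np.sqrt(6.0 / (3 * 3 * 3 + 3 * 3 * 16))
    print(w.min(), w.max(), limit)  # every weight should lie inside [-limit, limit]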
2.0 Compatible Answer: Even in Tensorflow 2.0, the default kernel initializer in tf.keras.layers.Conv2D and tf.keras.layers.Dense is glorot_uniform.

This is specified on the tensorflow.org website.

The link for Conv2D is https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D?version=nightly#init and the link for Dense is https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense?version=nightly#init.
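As a quick sanity check (a small sketch assuming TensorFlow 2.x; the layer sizes are arbitrary), you can instantiate the layers without specifying an initializer and inspect their configuration:

import tensorflow as tf  # assumes TF 2.x

conv = tf.keras.layers.Conv2D(filters=8, kernel_size=3)
dense = tf.keras.layers.Dense(units=4)

# Both layers report GlorotUniform (glorot_uniform) as their default kernel initializer
print(conv.get_config()['kernel_initializer'])
print(dense.get_config()['kernel_initializer'])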
According to this course by Andrew Ng and the Xavier documentation, if you are using ReLU as the activation function, it is better to change the default weight initializer (which is Xavier uniform) to Xavier normal:
y = tf.layers.conv2d(x, filters=32, kernel_size=3,  # filters/kernel_size are example values
                     kernel_initializer=tf.contrib.layers.xavier_initializer(uniform=False))
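On TensorFlow 2.x, where tf.layers and tf.contrib are gone, a roughly equivalent sketch (layer sizes again arbitrary) would use the Keras Glorot-normal initializer:

import tensorflow as tf  # assumes TF 2.x

x = tf.keras.Input(shape=(28, 28, 3))  # made-up input shape
y = tf.keras.layers.Conv2D(
    filters=32, kernel_size=3,
    kernel_initializer=tf.keras.initializers.GlorotNormal())(x)  # Xavier/Glorot normal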
In a CNN, kernel values are initialized randomly. The values are then readjusted during backpropagation to yield better edge-detection(!) kernels. See this