How to initialize weights in PyTorch?

暗喜 2020-11-28 01:10

How to initialize the weights and biases (for example, with He or Xavier initialization) in a network in PyTorch?

9 answers
  •  野性不改
    2020-11-28 01:30

    I don't have enough reputation yet to add a comment under the answer posted by prosti on Jun 26 '19 at 13:16, so I'm posting here instead.

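        import math
        from torch.nn import init

        def reset_parameters(self):
            # Weights: Kaiming (He) uniform init; a is the leaky-ReLU negative slope.
            # (Recent torch.nn.Linear source passes a=math.sqrt(5) here.)
            init.kaiming_uniform_(self.weight, a=math.sqrt(3))
            if self.bias is not None:
                # Biases: uniform in [-1/sqrt(fan_in), 1/sqrt(fan_in)].
                fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
                bound = 1 / math.sqrt(fan_in)
                init.uniform_(self.bias, -bound, bound)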
    
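For a custom network, the same pattern can be applied layer by layer. The following is a minimal sketch, not the library's own code: the helper name `init_weights`, the two-layer model, and its sizes are hypothetical, and it assumes ReLU follows each initialized layer. Only public `torch.nn.init` functions are used.

```python
import math

import torch
from torch import nn


def init_weights(module: nn.Module) -> None:
    """Hypothetical helper: He init for weights, fan-in-scaled uniform for biases."""
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        # He (Kaiming) uniform init, assuming a ReLU follows this layer.
        nn.init.kaiming_uniform_(module.weight, nonlinearity="relu")
        # For Xavier (Glorot) initialization instead, one could call:
        # nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            # Same bias rule as the quoted reset_parameters: fan_in is the
            # number of input connections feeding one output unit.
            fan_in = module.weight[0].numel()
            bound = 1 / math.sqrt(fan_in)
            nn.init.uniform_(module.bias, -bound, bound)


# Illustrative two-layer MLP; the sizes are arbitrary.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
model.apply(init_weights)  # apply() visits every submodule recursively
```

Using `model.apply` keeps the initialization rule in one place, so switching between He and Xavier only means changing a single call.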

    However, I want to point out that some assumptions in Kaiming He's paper, Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, do not strictly hold, even though the deliberately designed initialization method works very well in practice.

    For example, in the Backward Propagation Case subsection, they assume that $w_l$ and $\delta y_l$ are independent of each other. But take $\delta y^L_i$ at the output layer as an instance: with a typical cross-entropy loss it is, up to sign, $y_i-\operatorname{softmax}(y^L_i)=y_i-\operatorname{softmax}(w^L_i x^L_i)$, which depends directly on $w^L$, so the two cannot be strictly independent.
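The dependence can be checked numerically. This is a small sketch under stated assumptions: a single linear output layer with no bias, cross-entropy loss, and arbitrary sizes and seed; the helper name `output_grad` is hypothetical. Note that autograd returns softmax minus the one-hot label, which is the negative of the expression above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

x = torch.randn(5)        # input to the last layer, x^L
target = torch.tensor(2)  # true class index (arbitrary choice)
one_hot = F.one_hot(target, num_classes=3).float()


def output_grad(w: torch.Tensor) -> torch.Tensor:
    """Gradient of the cross-entropy loss w.r.t. the logits y^L = w x^L."""
    logits = (w @ x).requires_grad_()
    loss = F.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))
    (grad,) = torch.autograd.grad(loss, logits)
    return grad


# Two different weight matrices, same input and same label.
w_a = torch.randn(3, 5)
w_b = torch.randn(3, 5)
grad_a = output_grad(w_a)
grad_b = output_grad(w_b)

# Closed form of the output gradient: softmax(w x) - y.
closed_form_a = torch.softmax(w_a @ x, dim=0) - one_hot

print("matches closed form:", torch.allclose(grad_a, closed_form_a, atol=1e-6))
print("unchanged when w changes:", torch.allclose(grad_a, grad_b))
```

Since the output gradient is a function of the last-layer weights, changing only `w` changes the gradient, which is the point being made about the independence assumption.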

    So I think the true underlying reason why He initialization works so well remains to be uncovered, even though everyone has seen how much it helps when training deep networks.
