Neural Network Backpropagation Algorithm



[Figure: a typical three-layer neural network]
This is the basic structure of a typical three-layer neural network: Layer $L_1$ is the input layer, Layer $L_2$ is the hidden layer, and Layer $L_3$ is the output layer. We have a set of input data $\{x_1,x_2,x_3,\dots,x_n\}$ and a corresponding set of outputs $\{y_1,y_2,y_3,\dots,y_n\}$, and we want the hidden layer to perform some transformation so that feeding the inputs through the network yields the expected outputs.
The following concrete example walks through the forward-propagation and backpropagation passes of the neural network algorithm:
[Figure: a simple three-layer neural network example]
Assume the network model shown above, with expected outputs $target_1=0.01$ and $target_2=0.99$. The calculations below use inputs $i_1=0.05$, $i_2=0.10$, initial weights $w_1=0.15$, $w_2=0.20$, $w_3=0.25$, $w_4=0.30$, $w_5=0.40$, $w_6=0.45$, $w_7=0.50$, $w_8=0.55$, and biases $b_1=0.35$, $b_2=0.60$. The details of the training algorithm follow:

Forward Propagation

  1. Input layer -> hidden layer, with sigmoid as the activation function:
    $$\begin{aligned} net_{h1}&=w_1*i_1+w_2*i_2+b_1 \\ &=0.15*0.05+0.2*0.1+0.35=0.3775\\ out_{h1}&={1\over 1+e^{-net_{h1}}}={1\over1+e^{-0.3775}}=0.593269992\\ net_{h2}&=w_3*i_1+w_4*i_2+b_1 \\ &=0.25*0.05+0.3*0.1+0.35=0.3925\\ out_{h2}&={1\over 1+e^{-net_{h2}}}={1\over1+e^{-0.3925}}=0.596884378 \end{aligned}$$
    where $net_i$ denotes the input to neuron $i$ and $out_i$ denotes its output; this notation applies to all formulas below.
  2. Hidden layer -> output layer:
    $$\begin{aligned} net_{o1}&=out_{h1}*w_5+out_{h2}*w_6+b_2 \\ &=0.593269992*0.4+0.596884378*0.45+0.6=1.105905967\\ out_{o1}&={1\over 1+e^{-net_{o1}}}={1\over1+e^{-1.105905967}}=0.75136507\\ net_{o2}&=out_{h1}*w_7+out_{h2}*w_8+b_2 \\ &=0.593269992*0.5+0.596884378*0.55+0.6=1.224921404\\ out_{o2}&={1\over 1+e^{-net_{o2}}}={1\over1+e^{-1.224921404}}=0.772928465 \end{aligned}$$
  3. Compute the total error:
    $$E_{total}=\sum_{i=1}^n {1\over{2}}(target_i-out_{oi})^2$$
    where $target_i$ is the expected output and $out_{oi}$ is the actual output;
    Substituting the values:
    $$\begin{aligned} E_{o1}&={1\over2}(target_1-out_{o1})^2={1\over2}(0.01-0.75136507)^2=0.274811083\\ E_{o2}&={1\over2}(target_2-out_{o2})^2={1\over2}(0.99-0.772928465)^2=0.023560026\\ E_{total}&=E_{o1}+E_{o2}=0.298371109 \end{aligned}$$
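To make the numbers above easy to check, here is a minimal Python sketch of the forward pass; the code and its variable names are illustrative (they simply mirror the notation), not from the original article:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Inputs, initial weights, biases, and targets from the example
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b1, b2 = 0.35, 0.60
target1, target2 = 0.01, 0.99

# Input layer -> hidden layer
net_h1 = w1 * i1 + w2 * i2 + b1   # 0.3775
out_h1 = sigmoid(net_h1)          # ~0.593269992
net_h2 = w3 * i1 + w4 * i2 + b1   # 0.3925
out_h2 = sigmoid(net_h2)          # ~0.596884378

# Hidden layer -> output layer
net_o1 = out_h1 * w5 + out_h2 * w6 + b2   # ~1.105905967
out_o1 = sigmoid(net_o1)                  # ~0.75136507
net_o2 = out_h1 * w7 + out_h2 * w8 + b2   # ~1.224921404
out_o2 = sigmoid(net_o2)                  # ~0.772928465

# Total squared error
E_total = 0.5 * (target1 - out_o1) ** 2 + 0.5 * (target2 - out_o2) ** 2
print(E_total)                            # ~0.298371109
```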

Backpropagation

Hidden layer -> output layer weight update:

Apply the chain rule (with the sigmoid activation) to update the hidden-to-output weights. Take the update of $w_5$ as an example:
$$\begin{aligned} \cfrac{\partial{E_{total}}}{\partial{w_5}}&= \cfrac{\partial{E_{total}}}{\partial{out_{o1}}} *\cfrac{\partial{out_{o1}}}{\partial{net_{o1}}} *\cfrac{\partial{net_{o1}}}{\partial{w_5}}\\ \cfrac{\partial{E_{total}}}{\partial{out_{o1}}}&= \cfrac{\partial\left(\sum_{i=1}^n {1\over{2}}(target_i-out_{oi})^2\right)}{\partial{out_{o1}}} =\cfrac{\partial{E_{o1}}}{\partial{out_{o1}}} =(target_1-out_{o1})*(-1)\\ &=(0.01-0.75136507)*(-1)=0.74136507\\ \cfrac{\partial{out_{o1}}}{\partial{net_{o1}}}&=out_{o1}*(1-out_{o1})=0.75136507*(1-0.75136507)=0.186815602\\ \cfrac{\partial{net_{o1}}}{\partial{w_5}}&=out_{h1}=0.593269992\\ \cfrac{\partial{E_{total}}}{\partial{w_5}}&=0.74136507*0.186815602*0.593269992=0.082167041 \end{aligned}$$
Combining the above into a single formula:
$$\cfrac{\partial{E_{total}}}{\partial{w_5}}=-(target_1-out_{o1})*out_{o1}*(1-out_{o1})*out_{h1}$$
Let $\delta_{o1}$ denote the residual (error term) of the output layer:
$$\begin{aligned} \delta_{o1}&=\cfrac{\partial{E_{total}}}{\partial{out_{o1}}} *\cfrac{\partial{out_{o1}}}{\partial{net_{o1}}}=\cfrac{\partial{E_{total}}}{\partial{net_{o1}}}\\ &=-(target_1-out_{o1})*out_{o1}*(1-out_{o1}) \end{aligned}$$
The partial derivative of the total error with respect to $w_5$ can therefore be written as:
$$\cfrac{\partial{E_{total}}}{\partial{w_5}}=\delta_{o1}*out_{h1}$$
If the output-layer error term is instead defined with the opposite sign, the same derivative can be written as:
$$\cfrac{\partial{E_{total}}}{\partial{w_5}}=(-1)*\delta_{o1}*out_{h1}$$
Finally, setting the learning rate $\eta$ to 0.5, update the weight $w_5$:
$$\begin{aligned} w_5^+&=w_5-\eta*\cfrac{\partial{E_{total}}}{\partial{w_5}}\\ &=0.4-0.5*0.082167041=0.35891648 \end{aligned}$$
Following the same procedure, compute $w_6^+$, $w_7^+$, and $w_8^+$:
$$\begin{aligned} w_6^+&=0.408666186\\ w_7^+&=0.511301276\\ w_8^+&=0.561370121 \end{aligned}$$
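These hidden-to-output updates can be reproduced with a short continuation of the forward-pass sketch above; `delta_o1`, `delta_o2`, and the `*_new` names are again illustrative, not from the article:

```python
# Output-layer deltas: delta_o = dE_total/dnet_o = -(target - out_o) * out_o * (1 - out_o)
delta_o1 = -(target1 - out_o1) * out_o1 * (1 - out_o1)   # ~0.138498562
delta_o2 = -(target2 - out_o2) * out_o2 * (1 - out_o2)   # ~-0.038098238

# Gradient of E_total w.r.t. each hidden->output weight is delta_o * out_h
eta = 0.5
w5_new = w5 - eta * delta_o1 * out_h1   # ~0.35891648
w6_new = w6 - eta * delta_o1 * out_h2   # ~0.408666186
w7_new = w7 - eta * delta_o2 * out_h1   # ~0.5113013
w8_new = w8 - eta * delta_o2 * out_h2   # ~0.561370121
```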

Input layer -> hidden layer weight update:

The method is essentially the same as above, applying the chain rule repeatedly:
$$\cfrac{\partial{E_{total}}}{\partial{w_1}}= \cfrac{\partial{E_{total}}}{\partial{out_{h1}}} *\cfrac{\partial{out_{h1}}}{\partial{net_{h1}}} *\cfrac{\partial{net_{h1}}}{\partial{w_1}}$$
where
$$\begin{aligned} \cfrac{\partial{E_{total}}}{\partial{out_{h1}}} &=\cfrac{\partial{E_{o1}}}{\partial{out_{h1}}}+ \cfrac{\partial{E_{o2}}}{\partial{out_{h1}}}\\ \cfrac{\partial{E_{o1}}}{\partial{out_{h1}}}&= \cfrac{\partial{E_{o1}}}{\partial{out_{o1}}}* \cfrac{\partial{out_{o1}}}{\partial{net_{o1}}}* \cfrac{\partial{net_{o1}}}{\partial{out_{h1}}}\\ &=0.74136507*0.186815602*w_5=0.74136507*0.186815602*0.4=0.055399425\\ \cfrac{\partial{E_{o2}}}{\partial{out_{h1}}}&=-0.019049119\\ \cfrac{\partial{E_{total}}}{\partial{out_{h1}}}&=0.055399425+(-0.019049119)=0.036350306\\ \cfrac{\partial{out_{h1}}}{\partial{net_{h1}}}&=out_{h1}*(1-out_{h1})=0.593269992*(1-0.593269992)=0.241300709\\ \cfrac{\partial{net_{h1}}}{\partial{w_1}}&=i_1=0.05\\ \cfrac{\partial{E_{total}}}{\partial{w_1}}&=0.036350306*0.241300709*0.05=0.000438568 \end{aligned}$$
To simplify the formula, let $\delta_{h1}$ denote the error of hidden unit $h_1$:
$$\begin{aligned} \cfrac{\partial{E_{total}}}{\partial{w_1}}&=\left(\sum_{i=1}^n \cfrac{\partial{E_{o_i}}}{\partial{out_{o_i}}}* \cfrac{\partial{out_{o_i}}}{\partial{net_{o_i}}}* \cfrac{\partial{net_{o_i}}}{\partial{out_{h_1}}} \right)*\cfrac{\partial{out_{h_1}}}{\partial{net_{h_1}}}*\cfrac{\partial{net_{h_1}}}{\partial{w_1}}\\ &=\left(\sum_{i=1}^n \delta_{o_i}*w_{{h_1}{o_i}}\right)*out_{h_1}*(1-out_{h_1})*i_1\\ &=\delta_{h1}*i_1 \end{aligned}$$
where $w_{{h_1}{o_i}}$ denotes the weight from hidden neuron $h_1$ to output neuron $o_i$.
Update the weight $w_1$:
$$\begin{aligned} w_1^+&=w_1-\eta*\cfrac{\partial{E_{total}}}{\partial{w_1}}\\ &=0.15-0.5*0.000438568=0.149780716 \end{aligned}$$
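Continuing the same sketch, the hidden-layer delta folds together the error flowing back from both output neurons; note that it uses the original values of $w_5$ and $w_7$, not the freshly updated ones:

```python
# Error signal reaching out_h1 from both output neurons
# (uses the ORIGINAL w5 and w7, before their update)
dE_dout_h1 = delta_o1 * w5 + delta_o2 * w7          # ~0.036350306

# Hidden-layer delta: dE_total/dnet_h1
delta_h1 = dE_dout_h1 * out_h1 * (1 - out_h1)       # ~0.008771354

# Gradient w.r.t. w1 and the weight update
w1_new = w1 - eta * delta_h1 * i1                   # ~0.149780716
```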
The remaining input-to-hidden weights are updated in the same way, so they are not listed here one by one.
This covers the forward-propagation and backpropagation procedure of the neural network algorithm.
