There are \(m\) samples \((x_{i}, y_{i})\):
\[\{x_{1},x_{2},x_{3}...x_{m}\}\]
\[\{y_{1},y_{2},y_{3}...y_{m}\}\]
Each \(x_{i}\) starts as an \((n-1)\)-dimensional feature vector; a constant 1 is appended at the end so that it becomes \(n\)-dimensional, matching the dimension of \(w\) for the dot product (the last weight then acts as the intercept):
\[x_{i}=\{x_{i1},x_{i2},x_{i3}...x_{i(n-1)},1\}\]
where \(w\) is an \(n\)-dimensional vector:
\[w=\{w_{1},w_{2},w_{3}...w_{n}\}\]
Regression (hypothesis) function:
\[h_{w}(x_{i})=wx_{i}\]
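A minimal sketch of this hypothesis in plain Python, using hypothetical values (the sample `x_i` and weights `w` below are made up for illustration; the appended 1 makes the last weight the intercept):

```python
def h(w, x):
    """Regression hypothesis h_w(x) = w . x (dot product of weights and features)."""
    return sum(wj * xj for wj, xj in zip(w, x))

x_i = [2.0, 3.0, 1.0]   # two raw features, plus the appended constant 1
w = [0.5, -1.0, 4.0]    # n = 3 weights; w[2] multiplies the 1, acting as the bias

print(h(w, x_i))        # 0.5*2.0 - 1.0*3.0 + 4.0*1.0 = 2.0
```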
Loss function:
\[J(w)=\frac{1}{2}\sum_{i=1}^{m}(h_{w}(x_{i})-y_{i})^2\]
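The loss \(J(w)\) can be sketched directly from the formula; the toy data below is hypothetical, chosen so one weight vector fits it exactly:

```python
def loss(w, X, y):
    """J(w) = (1/2) * sum over i of (h_w(x_i) - y_i)^2."""
    total = 0.0
    for x, yi in zip(X, y):
        pred = sum(wj * xj for wj, xj in zip(w, x))  # h_w(x_i) = w . x_i
        total += (pred - yi) ** 2
    return 0.5 * total

# Hypothetical toy data: two samples, one feature plus the appended 1.
X = [[1.0, 1.0], [2.0, 1.0]]
y = [2.0, 3.0]
print(loss([1.0, 1.0], X, y))  # this w fits both samples exactly, so J = 0.0
```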
The goal is to find the \(w\) that minimizes \(J(w)\):
\[w^{*}=\arg\min_{w}J(w)\]
Take the partial derivative of the loss with respect to each component \(w_{j}\) of \(w\):
\[\frac{\partial J(w)}{\partial w_{j}}=\frac{\partial}{\partial w_{j}}\frac{1}{2}\sum_{i=1}^{m}(h_{w}(x_{i})-y_{i})^2\]
\[=\frac{1}{2}*2*\sum_{i=1}^{m}(h_{w}(x_{i})-y_{i})*\frac{\partial (h_{w}(x_{i})-y_{i})}{\partial w_{j}}\]
\[=\sum_{i=1}^{m}(h_{w}(x_{i})-y_{i})*\frac{\partial (wx_{i}-y_{i})}{\partial w_{j}}\]
\[\frac{\partial J(w)}{\partial w_{j}}=\sum_{i=1}^{m}(h_{w}(x_{i})-y_{i})*x_{ij}\]
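One way to sanity-check the derived gradient formula is to compare it against a central finite difference of the loss. The data and weights below are hypothetical; the analytic and numeric gradients should agree to floating-point precision because \(J\) is quadratic:

```python
def grad(w, X, y):
    """dJ/dw_j = sum over i of (h_w(x_i) - y_i) * x_ij, per the derivation."""
    g = [0.0] * len(w)
    for x, yi in zip(X, y):
        err = sum(wj * xj for wj, xj in zip(w, x)) - yi  # h_w(x_i) - y_i
        for j in range(len(w)):
            g[j] += err * x[j]
    return g

def loss(w, X, y):
    return 0.5 * sum(
        (sum(wj * xj for wj, xj in zip(w, x)) - yi) ** 2 for x, yi in zip(X, y)
    )

# Hypothetical data: one feature plus the appended 1 per sample.
X = [[1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
y = [2.0, 2.5, 4.0]
w = [0.3, -0.7]
eps = 1e-6
for j in range(len(w)):
    wp = list(w); wp[j] += eps
    wm = list(w); wm[j] -= eps
    numeric = (loss(wp, X, y) - loss(wm, X, y)) / (2 * eps)
    print(abs(grad(w, X, y)[j] - numeric) < 1e-5)
```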
Update each \(w_{j}\) in \(w\), where \(\alpha\) is the learning rate:
\[w_{j}:=w_{j}-\alpha*\frac{\partial J(w)}{\partial w_{j}}\]
Batch gradient descent: every update of each \(w_{j}\) uses all \(m\) samples:
\[w_{j}:=w_{j}-\alpha*\sum_{i=1}^{m}(h_{w}(x_{i})-y_{i})*x_{ij}\]
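Putting the pieces together, a minimal batch gradient descent loop might look like the following. The data is hypothetical, generated exactly from \(y = 2x + 1\) (with the constant 1 appended to each sample), and the learning rate `alpha` and iteration count are assumptions chosen so the loop converges on this tiny problem:

```python
def batch_gd(X, y, alpha=0.05, iters=2000):
    """Batch gradient descent: each update sums the gradient over all m samples."""
    w = [0.0] * len(X[0])
    for _ in range(iters):
        g = [0.0] * len(w)
        for x, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, x)) - yi  # h_w(x_i) - y_i
            for j in range(len(w)):
                g[j] += err * x[j]                           # sum_i err * x_ij
        w = [wj - alpha * gj for wj, gj in zip(w, g)]        # w_j := w_j - alpha * dJ/dw_j
    return w

# Hypothetical data from y = 2*x + 1; the second column is the appended 1.
X = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
y = [1.0, 3.0, 5.0, 7.0]
w = batch_gd(X, y)
print(w)  # converges toward [2.0, 1.0]: slope 2, intercept 1
```

Because the data is exactly linear, the least-squares minimum recovers the generating slope and intercept; on noisy data the loop would instead converge to the best-fit line.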
Source: https://www.cnblogs.com/smallredness/p/11027873.html