Figure 1: Description of the batch-normalization algorithm
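For reference, a standard restatement of the algorithm the figure describes (following Ioffe & Szegedy, 2015; this formulation is supplied here, not taken from the figure). For a mini-batch $B = \{x_1, \dots, x_m\}$ with learned scale $\gamma$ and offset $\beta$:

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i,\qquad \sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \mu_B)^2$$

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}},\qquad y_i = \gamma\,\hat{x}_i + \beta$$

In the TensorFlow code below, `scale` corresponds to $\gamma$, `offset` to $\beta$, and `variance_epsilon` to $\varepsilon$.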
The function body of TensorFlow's `tf.nn.batch_normalization` is implemented as follows:
```python
with ops.name_scope(name, "batchnorm", [x, mean, variance, scale, offset]):
    inv = math_ops.rsqrt(variance + variance_epsilon)
    if scale is not None:
        inv *= scale
    # Note: tensorflow/contrib/quantize/python/fold_batch_norms.py depends on
    # the precise order of ops that are generated by the expression below.
    return x * math_ops.cast(inv, x.dtype) + math_ops.cast(
        offset - mean * inv if offset is not None else -mean * inv, x.dtype)
```
The code above is shown as an example.
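As a quick sanity check (a minimal NumPy sketch, not part of the TensorFlow source), the fused expression returned above, `x * inv + (offset - mean * inv)`, is algebraically identical to the textbook form `scale * (x - mean) / sqrt(variance + eps) + offset`:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
mean, variance = x.mean(axis=0), x.var(axis=0)
scale, offset, eps = 1.5, 0.2, 1e-3

# Fused form, matching the order of ops in the TensorFlow snippet above:
inv = scale / np.sqrt(variance + eps)   # rsqrt(variance + eps), then *= scale
fused = x * inv + (offset - mean * inv)

# Textbook form: gamma * (x - mean) / sqrt(variance + eps) + beta
textbook = scale * (x - mean) / np.sqrt(variance + eps) + offset

assert np.allclose(fused, textbook)
```

The fused form lets `inv` and `offset - mean * inv` be computed once per batch and then applied to `x` with a single multiply-add.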
- Where it is applied: immediately before the non-linearity (activation function), for example:
```python
Y1l = tf.matmul(XX, W1)                                # linear (pre-activation) output
Y1bn, update_ema1 = batchnorm(Y1l, O1, S1, tst, iter)  # batch-norm before the activation
Y1 = tf.nn.sigmoid(Y1bn)                               # non-linearity applied last
```
- Difference between training and testing: during training, the mean and variance are computed directly from the current mini-batch; during testing, they are taken from an exponential moving average accumulated over the training samples. A usage sketch follows the function below.
```python
def batchnorm(Ylogits, Offset, Scale, is_test, iteration):
    # Passing the iteration count as num_updates prevents the moving average
    # from being biased by averaging across non-existing early iterations.
    exp_moving_avg = tf.train.ExponentialMovingAverage(0.998, iteration)
    bnepsilon = 1e-5
    mean, variance = tf.nn.moments(Ylogits, [0])            # per-mini-batch statistics
    update_moving_averages = exp_moving_avg.apply([mean, variance])
    # Test time: read the moving averages; training time: use batch statistics.
    m = tf.cond(is_test, lambda: exp_moving_avg.average(mean), lambda: mean)
    v = tf.cond(is_test, lambda: exp_moving_avg.average(variance), lambda: variance)
    Ybn = tf.nn.batch_normalization(Ylogits, m, v, Offset, Scale, bnepsilon)
    return Ybn, update_moving_averages
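Note that `update_moving_averages` is an op that only takes effect when it is explicitly run. A minimal TF1-style sketch of how this fits into training and testing (the names `XX`, `W1`, `O1`, `S1`, `tst` follow the snippet above; the shapes and data are illustrative, and `iter` is renamed `it` here to avoid shadowing the Python builtin):

```python
import numpy as np
import tensorflow as tf

XX = tf.placeholder(tf.float32, [None, 784])   # input batch (shape is illustrative)
tst = tf.placeholder(tf.bool)                  # True at test time
it = tf.placeholder(tf.int32)                  # training-step counter fed to the EMA

W1 = tf.Variable(tf.truncated_normal([784, 200], stddev=0.1))
O1 = tf.Variable(tf.zeros([200]))              # offset (beta), learned
S1 = tf.Variable(tf.ones([200]))               # scale (gamma), learned

Y1l = tf.matmul(XX, W1)
Y1bn, update_ema1 = batchnorm(Y1l, O1, S1, tst, it)
Y1 = tf.nn.sigmoid(Y1bn)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        bx = np.random.rand(64, 784).astype(np.float32)
        # Training: batch statistics are used, and the EMA update op
        # must be fetched together with each step.
        sess.run([Y1, update_ema1], feed_dict={XX: bx, tst: False, it: i})
    # Testing: is_test=True switches to the accumulated moving averages.
    test_x = np.random.rand(5, 784).astype(np.float32)
    sess.run(Y1, feed_dict={XX: test_x, tst: True, it: 0})
```

In a real training graph the `sess.run` call would also fetch the optimizer's train step; the point here is only that the EMA update must be run explicitly at every training step.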