Figure 1: Description of the batch-normalization algorithm
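For reference, a standard restatement of the algorithm the figure describes (following Ioffe & Szegedy, 2015; this formulation is supplied here, not taken from the figure). For a mini-batch $B = \{x_1, \dots, x_m\}$ with learned scale $\gamma$ and offset $\beta$:

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i,\qquad \sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \mu_B)^2$$

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \varepsilon}},\qquad y_i = \gamma\,\hat{x}_i + \beta$$

In the TensorFlow code below, `scale` corresponds to $\gamma$, `offset` to $\beta$, and `variance_epsilon` to $\varepsilon$.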
The function body of TensorFlow's `tf.nn.batch_normalization` is implemented as follows:
```python
with ops.name_scope(name, "batchnorm", [x, mean, variance, scale, offset]):
    inv = math_ops.rsqrt(variance + variance_epsilon)
    if scale is not None:
        inv *= scale
    # Note: tensorflow/contrib/quantize/python/fold_batch_norms.py depends on
    # the precise order of ops that are generated by the expression below.
    return x * math_ops.cast(inv, x.dtype) + math_ops.cast(
        offset - mean * inv if offset is not None else -mean * inv, x.dtype)
```
The code above is shown as an example.
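As a quick sanity check (a minimal NumPy sketch, not part of the TensorFlow source), the fused expression returned above, `x * inv + (offset - mean * inv)`, is algebraically identical to the textbook form `scale * (x - mean) / sqrt(variance + eps) + offset`:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
mean, variance = x.mean(axis=0), x.var(axis=0)
scale, offset, eps = 1.5, 0.2, 1e-3

# Fused form, matching the order of ops in the TensorFlow snippet above:
inv = scale / np.sqrt(variance + eps)   # rsqrt(variance + eps), then *= scale
fused = x * inv + (offset - mean * inv)

# Textbook form: gamma * (x - mean) / sqrt(variance + eps) + beta
textbook = scale * (x - mean) / np.sqrt(variance + eps) + offset

assert np.allclose(fused, textbook)
```

The fused form lets `inv` and `offset - mean * inv` be computed once per batch and then applied to `x` with a single multiply-add.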
- Where it is applied: immediately before the non-linearity (activation function), for example:
```python
Y1l = tf.matmul(XX, W1)                                # linear (pre-activation) output
Y1bn, update_ema1 = batchnorm(Y1l, O1, S1, tst, iter)  # batch-norm before the activation
Y1 = tf.nn.sigmoid(Y1bn)                               # non-linearity applied last
```
- Difference between training and testing: during training, the mean and variance are computed directly from the current mini-batch; during testing, they are taken from an exponential moving average accumulated over the training samples. A usage sketch follows the function below.
```python
def batchnorm(Ylogits, Offset, Scale, is_test, iteration):
    # Passing the iteration count as num_updates prevents the moving average
    # from being biased by averaging across non-existing early iterations.
    exp_moving_avg = tf.train.ExponentialMovingAverage(0.998, iteration)
    bnepsilon = 1e-5
    mean, variance = tf.nn.moments(Ylogits, [0])            # per-mini-batch statistics
    update_moving_averages = exp_moving_avg.apply([mean, variance])
    # Test time: read the moving averages; training time: use batch statistics.
    m = tf.cond(is_test, lambda: exp_moving_avg.average(mean), lambda: mean)
    v = tf.cond(is_test, lambda: exp_moving_avg.average(variance), lambda: variance)
    Ybn = tf.nn.batch_normalization(Ylogits, m, v, Offset, Scale, bnepsilon)
    return Ybn, update_moving_averages
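Note that `update_moving_averages` is an op that only takes effect when it is explicitly run. A minimal TF1-style sketch of how this fits into training and testing (the names `XX`, `W1`, `O1`, `S1`, `tst` follow the snippet above; the shapes and data are illustrative, and `iter` is renamed `it` here to avoid shadowing the Python builtin):

```python
import numpy as np
import tensorflow as tf

XX = tf.placeholder(tf.float32, [None, 784])   # input batch (shape is illustrative)
tst = tf.placeholder(tf.bool)                  # True at test time
it = tf.placeholder(tf.int32)                  # training-step counter fed to the EMA

W1 = tf.Variable(tf.truncated_normal([784, 200], stddev=0.1))
O1 = tf.Variable(tf.zeros([200]))              # offset (beta), learned
S1 = tf.Variable(tf.ones([200]))               # scale (gamma), learned

Y1l = tf.matmul(XX, W1)
Y1bn, update_ema1 = batchnorm(Y1l, O1, S1, tst, it)
Y1 = tf.nn.sigmoid(Y1bn)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(100):
        bx = np.random.rand(64, 784).astype(np.float32)
        # Training: batch statistics are used, and the EMA update op
        # must be fetched together with each step.
        sess.run([Y1, update_ema1], feed_dict={XX: bx, tst: False, it: i})
    # Testing: is_test=True switches to the accumulated moving averages.
    test_x = np.random.rand(5, 784).astype(np.float32)
    sess.run(Y1, feed_dict={XX: test_x, tst: True, it: 0})
```

In a real training graph the `sess.run` call would also fetch the optimizer's train step; the point here is only that the EMA update must be run explicitly at every training step.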