I can't understand why dropout works like this in TensorFlow. The CS231n notes say that "dropout is implemented by only keeping a neuron active with some probability p (a hyperparameter), or setting it to zero otherwise."
If you keep reading the CS231n notes, the difference between vanilla dropout and inverted dropout is explained.
Since we want to leave the forward pass at test time untouched (and tweak our network only during training), tf.nn.dropout directly implements inverted dropout: the values that survive are scaled up at train time, so the expected activation stays the same and no rescaling is needed at test time.
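For concreteness, here is a minimal sketch of that scaling. It assumes TensorFlow 2's `tf.nn.dropout(x, rate=...)` signature (older 1.x versions took a `keep_prob` argument instead); surviving entries are multiplied by 1/(1 - rate) so the expected value of each activation is unchanged:

```python
import tensorflow as tf

x = tf.ones([1, 10])

# Training: each element is zeroed with probability rate = 0.5;
# the survivors are scaled by 1 / (1 - rate) = 2.0, so the
# expected value of every entry stays 1.0 (inverted dropout).
y = tf.nn.dropout(x, rate=0.5)
print(y)  # entries are either 0.0 or 2.0

# Test time: simply don't call dropout -- no rescaling needed,
# which is the "forward pass left untouched" property.
```

With classic (non-inverted) dropout, the scaling factor would instead have to be applied at test time; the inverted form moves that cost into training so the test-time graph is identical to a network trained without dropout.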