I am training a deep neural network that consists of 7 layers (4 conv2d layers and 3 fully connected layers). All layers use ReLU as the activation function. Now I see that the output valu