How to add regularizations in TensorFlow?

慢半拍i 2020-12-07 06:55

I found in many available neural network implementations written in TensorFlow that regularization terms are implemented by manually adding an additional term to the loss value. Is there a more elegant or recommended way of doing this, for example via the regularizer argument of get_variable?
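For reference, the manual approach I mean looks roughly like this (a sketch; my_task_loss and the 0.01 scale are placeholders, not from any particular codebase):

    # explicit L2 penalty added to the task loss by hand
    loss = my_task_loss + 0.01 * tf.add_n(
        [tf.nn.l2_loss(v) for v in tf.trainable_variables()])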

10 Answers
  • 2020-12-07 07:30

    If anyone's still looking, I'd just like to add that in tf.keras you can add weight regularization by passing a regularizer as an argument to your layers. An example of adding L2 regularization, adapted from the TensorFlow Keras tutorials:

    import tensorflow as tf
    from tensorflow import keras

    # NUM_WORDS is the input dimensionality (the vocabulary size in the tutorial).
    model = keras.models.Sequential([
        keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                           activation=tf.nn.relu, input_shape=(NUM_WORDS,)),
        keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                           activation=tf.nn.relu),
        keras.layers.Dense(1, activation=tf.nn.sigmoid)
    ])
    

    There's no need to manually add in the regularization losses with this method as far as I know.
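    To illustrate (a sketch; the compile arguments simply mirror the tutorial's binary-classification setup), the penalties created by the layers are tracked on the model and folded into the training loss automatically:

    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    # The individual penalty tensors are still accessible if you need them:
    print(model.losses)  # one L2 term per regularized Dense layer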

    Reference: https://www.tensorflow.org/tutorials/keras/overfit_and_underfit#add_weight_regularization

  • 2020-12-07 07:33

    I tested tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES) and tf.losses.get_regularization_loss() with one l2_regularizer in the graph, and found that they return the same value. Judging by the magnitude of that value, the regularization constant is already baked in via the scale parameter of tf.contrib.layers.l2_regularizer, so there is no need to multiply by a separate reg_constant again.
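    A minimal sketch to check this (my own example, assuming TF 1.x with tf.contrib available):

    import tensorflow as tf

    w = tf.get_variable('w', shape=[3, 3],
                        regularizer=tf.contrib.layers.l2_regularizer(scale=0.1))
    from_collection = tf.add_n(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))
    from_helper = tf.losses.get_regularization_loss()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Both print the same scaled penalty: 0.1 * sum(w**2) / 2
        print(sess.run([from_collection, from_helper]))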

  • 2020-12-07 07:35

    Some of the answers left me more confused, so here are two methods to make it clear.

    import tensorflow as tf

    # 1. Adding all regularization terms by hand
    var1 = tf.get_variable(name='v1', shape=[1], dtype=tf.float32)
    var2 = tf.Variable(name='v2', initial_value=1.0, dtype=tf.float32)
    regularizer = tf.contrib.layers.l1_regularizer(0.1)
    reg_term = tf.contrib.layers.apply_regularization(regularizer, [var1, var2])
    # here reg_term is a scalar

    # 2. Losses are added and collected automatically, but only for variables
    #    created with get_variable
    with tf.variable_scope('x',
            regularizer=tf.contrib.layers.l2_regularizer(0.1)):
        var1 = tf.get_variable(name='v1', shape=[1], dtype=tf.float32)
        var2 = tf.get_variable(name='v2', shape=[1], dtype=tf.float32)
    reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    # here reg_losses is a list; it still has to be summed
    

    Then it can be added to the total loss, as shown below.
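    A minimal sketch (base_loss stands for whatever task loss you already compute):

    total_loss = base_loss + reg_term                # method 1
    total_loss = base_loss + tf.add_n(reg_losses)    # method 2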

  • 2020-12-07 07:35
    # Task loss: softmax cross-entropy on the logits.
    cross_entropy = tf.losses.softmax_cross_entropy(
        logits=logits, onehot_labels=labels)

    # L2 penalty over all trainable variables, scaled by weight_decay.
    l2_loss = weight_decay * tf.add_n(
        [tf.nn.l2_loss(tf.cast(v, tf.float32)) for v in tf.trainable_variables()])

    loss = cross_entropy + l2_loss
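    Note that this penalizes every trainable variable, biases included. A common variant (a sketch, assuming bias variables have 'bias' in their names) restricts the penalty to the weights:

    l2_loss = weight_decay * tf.add_n(
        [tf.nn.l2_loss(v) for v in tf.trainable_variables()
         if 'bias' not in v.name.lower()])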
    
  • 2020-12-07 07:39

    As you say in the second point, using the regularizer argument is the recommended way. You can use it in get_variable, or set it once in your variable_scope and have all your variables regularized.
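    For instance (a sketch with a made-up variable name and an arbitrary scale of 0.01):

    weights = tf.get_variable(
        name='weights', shape=[100, 10],
        regularizer=tf.contrib.layers.l2_regularizer(scale=0.01))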

    The losses are collected in the graph, and you need to manually add them to your cost function like this.

      reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
      reg_constant = 0.01  # Choose an appropriate one.
      loss = my_normal_loss + reg_constant * sum(reg_losses)
    

    Hope that helps!

  • 2020-12-07 07:39

    Another option is to do this with the contrib.learn library, based on the Deep MNIST tutorial on the TensorFlow website. First, assuming you've imported the relevant libraries (such as import tensorflow.contrib.layers as layers), you can define a network in a separate method:

    def easier_network(x, reg):
        """ A network based on tf.contrib.learn, with input `x`. """
        with tf.variable_scope('EasyNet'):
            out = layers.flatten(x)
            out = layers.fully_connected(out, 
                    num_outputs=200,
                    weights_initializer = layers.xavier_initializer(uniform=True),
                    weights_regularizer = layers.l2_regularizer(scale=reg),
                    activation_fn = tf.nn.tanh)
            out = layers.fully_connected(out, 
                    num_outputs=200,
                    weights_initializer = layers.xavier_initializer(uniform=True),
                    weights_regularizer = layers.l2_regularizer(scale=reg),
                    activation_fn = tf.nn.tanh)
            out = layers.fully_connected(out, 
                    num_outputs=10, # Because there are ten digits!
                    weights_initializer = layers.xavier_initializer(uniform=True),
                    weights_regularizer = layers.l2_regularizer(scale=reg),
                    activation_fn = None)
            return out 
    

    Then, in a main method, you can use the following code snippet:

    def main(_):
        mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
        x = tf.placeholder(tf.float32, [None, 784])
        y_ = tf.placeholder(tf.float32, [None, 10])
    
        # Make a network with regularization
        y_conv = easier_network(x, FLAGS.regu)
        weights = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'EasyNet') 
        print("")
        for w in weights:
            shp = w.get_shape().as_list()
            print("- {} shape:{} size:{}".format(w.name, shp, np.prod(shp)))
        print("")
        reg_ws = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES, 'EasyNet')
        for w in reg_ws:
            shp = w.get_shape().as_list()
            print("- {} shape:{} size:{}".format(w.name, shp, np.prod(shp)))
        print("")
    
        # Make the loss function `loss_fn` with regularization.
        cross_entropy = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
        loss_fn = cross_entropy + tf.reduce_sum(reg_ws)
        train_step = tf.train.AdamOptimizer(1e-4).minimize(loss_fn)
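    Continuing inside main, a minimal training and evaluation loop in the spirit of the tutorial might look like this (a sketch; the step count and batch size are arbitrary choices, not part of the original answer):

        correct = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for _ in range(2000):
                batch = mnist.train.next_batch(50)
                sess.run(train_step, feed_dict={x: batch[0], y_: batch[1]})
            print('test accuracy:', sess.run(
                accuracy, feed_dict={x: mnist.test.images,
                                     y_: mnist.test.labels}))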
    

    To get this to work you need to follow the Deep MNIST tutorial mentioned above and import the relevant libraries, but it's a nice exercise for learning TensorFlow, and it's easy to see how the regularization affects the output. If you pass a regularization scale as an argument, you see the following:

    - EasyNet/fully_connected/weights:0 shape:[784, 200] size:156800
    - EasyNet/fully_connected/biases:0 shape:[200] size:200
    - EasyNet/fully_connected_1/weights:0 shape:[200, 200] size:40000
    - EasyNet/fully_connected_1/biases:0 shape:[200] size:200
    - EasyNet/fully_connected_2/weights:0 shape:[200, 10] size:2000
    - EasyNet/fully_connected_2/biases:0 shape:[10] size:10
    
    - EasyNet/fully_connected/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
    - EasyNet/fully_connected_1/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
    - EasyNet/fully_connected_2/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
    

    Notice that the regularization portion gives you three items, one for each regularized weight matrix in the network.

    With regularizations of 0, 0.0001, 0.01, and 1.0, I get test accuracy values of 0.9468, 0.9476, 0.9183, and 0.1135, respectively, showing the dangers of high regularization terms.
