How to add regularization in TensorFlow?

In much of the publicly available neural network code implemented with TensorFlow, I found that regularization terms are implemented by manually adding an additional term to the loss value.

My questions are:

  • Is there a more elegant or recommended way to apply regularization than doing it manually?

  • I also find that get_variable has an argument regularizer. How should it be used? From my observation, if we pass a regularizer to it (such as tf.contrib.layers.l2_regularizer), a tensor representing the regularization term will be computed and added to a graph collection named tf.GraphKeys.REGULARIZATION_LOSSES. Will that collection be used automatically by TensorFlow (e.g. by optimizers when training)? Or am I expected to use that collection by myself?
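    For concreteness, a minimal sketch of the behavior I am describing (TF 1.x; the variable name and shape are arbitrary):

      import tensorflow as tf

      regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)

      # Passing `regularizer` makes get_variable compute a penalty tensor
      # for this variable and add it to tf.GraphKeys.REGULARIZATION_LOSSES.
      weights = tf.get_variable(
          name="weights", shape=[10, 10], regularizer=regularizer)

      # The collection now holds the penalty tensor; does anything
      # consume it automatically during training?
      print(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))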


  • As you say in the second point, using the regularizer argument is the recommended way. You can use it in get_variable, or set it once in your variable_scope to have all the variables in that scope regularized (a sketch of the scope-wide variant follows the snippet below).

    The losses are collected in the graph, and you need to add them to your cost function manually, like this:

      reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
      reg_constant = 0.01  # Choose an appropriate one.
      loss = my_normal_loss + reg_constant * sum(reg_losses)
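
    For the scope-wide variant, a minimal sketch (TF 1.x; the scope and variable names are just for illustration):

      import tensorflow as tf

      regularizer = tf.contrib.layers.l2_regularizer(scale=0.01)

      # Every variable created with get_variable inside this scope inherits
      # the scope's regularizer, so it is set only once.
      with tf.variable_scope('my_layer', regularizer=regularizer):
          weights = tf.get_variable('weights', shape=[100, 50])

      reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)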
    

    Hope that helps!


    A few aspects of the existing answer were not immediately clear to me, so here is a step-by-step guide:

  • Define a regularizer. This is where the regularization constant can be set, e.g.:

    regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
    
  • Create variables via:

        weights = tf.get_variable(
            name="weights",
            regularizer=regularizer,
            ...
        )
    

    Alternatively, variables can be created via the regular weights = tf.Variable(...) constructor; in that case the penalty tensor is not computed for you, so add the computed loss (rather than the variable itself) to the collection: tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, regularizer(weights)).

  • Define some loss term and add the regularization term:

    reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    reg_term = tf.add_n(reg_losses)
    loss += reg_term


    Note: the collection already holds the computed loss tensors (one per regularized variable), so summing them with tf.add_n is all that is needed. There is also tf.contrib.layers.apply_regularization(regularizer, weights_list), which applies the regularizer to each variable in weights_list and sums the results with an AddN; calling it on this collection of loss tensors would apply the regularizer a second time.
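
    Putting the three steps together, a minimal end-to-end sketch (TF 1.x contrib API; the shape and the stand-in data loss are just for illustration):

        import tensorflow as tf

        # Step 1: define the regularizer (this fixes the regularization constant).
        regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)

        # Step 2: create variables; each registers its penalty tensor in
        # tf.GraphKeys.REGULARIZATION_LOSSES.
        weights = tf.get_variable(
            name="weights", shape=[784, 10], regularizer=regularizer)

        # Stand-in for whatever data loss you normally compute.
        my_normal_loss = tf.constant(0.0)

        # Step 3: sum the collected penalties into the total loss.
        reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
        loss = my_normal_loss + tf.add_n(reg_losses)

    If I recall correctly, later 1.x releases also provide tf.losses.get_regularization_loss(), which performs the same sum over that collection for you.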


  • Another option is to do this with the tf.contrib.layers library, as follows, based on the Deep MNIST tutorial on the TensorFlow website. First, assuming you have imported the relevant libraries (such as import tensorflow.contrib.layers as layers), you can define a network in a separate method:

    def easier_network(x, reg):
        """ A simple network built from tf.contrib.layers, with input `x`
        and L2 regularization scale `reg`. """
        with tf.variable_scope('EasyNet'):
            out = layers.flatten(x)
            out = layers.fully_connected(out,
                    num_outputs=200,
                    weights_initializer=layers.xavier_initializer(uniform=True),
                    weights_regularizer=layers.l2_regularizer(scale=reg),
                    activation_fn=tf.nn.tanh)
            out = layers.fully_connected(out,
                    num_outputs=200,
                    weights_initializer=layers.xavier_initializer(uniform=True),
                    weights_regularizer=layers.l2_regularizer(scale=reg),
                    activation_fn=tf.nn.tanh)
            out = layers.fully_connected(out,
                    num_outputs=10,  # Because there are ten digits!
                    weights_initializer=layers.xavier_initializer(uniform=True),
                    weights_regularizer=layers.l2_regularizer(scale=reg),
                    activation_fn=None)
            return out
    

    Then, in a main method, you can use the following code snippet:

    def main(_):
        mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
        x = tf.placeholder(tf.float32, [None, 784])
        y_ = tf.placeholder(tf.float32, [None, 10])
    
        # Make a network with regularization
        y_conv = easier_network(x, FLAGS.regu)
        weights = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'EasyNet') 
        print("")
        for w in weights:
            shp = w.get_shape().as_list()
            print("- {} shape:{} size:{}".format(w.name, shp, np.prod(shp)))
        print("")
        reg_ws = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES, 'EasyNet')
        for w in reg_ws:
            shp = w.get_shape().as_list()
            print("- {} shape:{} size:{}".format(w.name, shp, np.prod(shp)))
        print("")
    
        # Make the loss function `loss_fn` with regularization.
        cross_entropy = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
        loss_fn = cross_entropy + tf.reduce_sum(reg_ws)
        train_step = tf.train.AdamOptimizer(1e-4).minimize(loss_fn)
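
    The snippet above stops after defining train_step. To actually train and evaluate the model, a sketch of the remaining boilerplate from the Deep MNIST tutorial (the batch size and step count follow the tutorial's conventions, not requirements):

        # Continuing inside main(_), after train_step is defined:
        correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for _ in range(2000):
                batch = mnist.train.next_batch(50)
                sess.run(train_step, feed_dict={x: batch[0], y_: batch[1]})
            test_acc = sess.run(accuracy, feed_dict={
                x: mnist.test.images, y_: mnist.test.labels})
            print("test accuracy: %g" % test_acc)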
    

    To get this to work you need to follow the Deep MNIST tutorial mentioned earlier and import the relevant libraries, but it is a nice exercise for learning TensorFlow, and it makes it easy to see how regularization affects the output. If you pass a regularization scale as an argument, you see the following:

    - EasyNet/fully_connected/weights:0 shape:[784, 200] size:156800
    - EasyNet/fully_connected/biases:0 shape:[200] size:200
    - EasyNet/fully_connected_1/weights:0 shape:[200, 200] size:40000
    - EasyNet/fully_connected_1/biases:0 shape:[200] size:200
    - EasyNet/fully_connected_2/weights:0 shape:[200, 10] size:2000
    - EasyNet/fully_connected_2/biases:0 shape:[10] size:10
    
    - EasyNet/fully_connected/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
    - EasyNet/fully_connected_1/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
    - EasyNet/fully_connected_2/kernel/Regularizer/l2_regularizer:0 shape:[] size:1.0
    

    Notice that the regularization portion gives you three items, one per fully_connected layer: only the weights are regularized here, not the biases.

    With regularization scales of 0, 0.0001, 0.01, and 1.0, I get test accuracy values of 0.9468, 0.9476, 0.9183, and 0.1135, respectively, showing the danger of setting the regularization constant too high.
