Tensorflow: Using Adam optimizer

Asked by 时光取名叫无心 on 2020-11-29 20:53

I am experimenting with some simple models in TensorFlow, including one that looks very similar to the first MNIST for ML Beginners example, but with a somewhat larger dimensionality. Gradient descent trains fine, but when I switch to the Adam optimizer I get a FailedPreconditionError complaining about uninitialized Adam variables. What is going on, and how do I fix it?

5 Answers
  •  Answered by 悲&欢浪女 on 2020-11-29 21:42

    The AdamOptimizer class creates additional variables, called "slots", to hold values for the "m" and "v" accumulators.

    See the source here if you're curious; it's actually quite readable: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/adam.py#L39 . Other optimizers, such as Momentum and Adagrad, use slots too.
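
    If you want to see what gets created, here is a minimal sketch using the same TF 1.x-era graph API (the variable names are illustrative, not from the question):

    import tensorflow as tf

    w = tf.Variable(tf.zeros([10]), name="w")
    loss = tf.reduce_sum(tf.square(w - 1.0))
    opt = tf.train.AdamOptimizer(1e-4)
    train_op = opt.minimize(loss)

    print(opt.get_slot_names())   # ['m', 'v']
    print(opt.get_slot(w, "m"))   # the "m" accumulator variable created for w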

    These variables must be initialized before you can train a model.

    The normal way to initialize variables is to call tf.initialize_all_variables(), which adds an op to initialize all of the variables present in the graph at the time it is called. (In TensorFlow 0.12 and later this function was renamed tf.global_variables_initializer().)

    (Aside: despite its name, initialize_all_variables() does not initialize anything itself; it only adds an op that will initialize the variables when it is run.)

    What you must do is call initialize_all_variables() after you have added the optimizer:

    ...build your model...
    # Add the optimizer
    train_op = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    # Add the ops to initialize variables.  These will include 
    # the optimizer slots added by AdamOptimizer().
    init_op = tf.initialize_all_variables()
    
    # Launch the graph in a session.
    sess = tf.Session()
    # Actually initialize the variables.
    sess.run(init_op)
    # Now train your model.
    for ...:
      sess.run(train_op)
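
    For contrast, here is a minimal sketch (same graph-mode API as above; variable names are illustrative) of the ordering bug that produces the uninitialized-variable error: if the init op is created before the optimizer, the Adam slots are not covered by it.

    import tensorflow as tf

    w = tf.Variable(tf.zeros([10]))
    loss = tf.reduce_sum(tf.square(w))
    # Init op created too early: it only covers variables that exist right now.
    init_op = tf.initialize_all_variables()
    # AdamOptimizer adds its "m" and "v" slot variables AFTER init_op was built.
    train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

    sess = tf.Session()
    sess.run(init_op)
    sess.run(train_op)  # FailedPreconditionError: uses uninitialized Adam slots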
    
