I am experimenting with some simple models in tensorflow, including one that looks very similar to the first MNIST for ML Beginners example, but with a somewhat larger dimen
The AdamOptimizer class creates additional variables, called "slots", to hold values for the "m" and "v" accumulators.
See the source here if you're curious, it's actually quite readable: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/adam.py#L39 . Other optimizers, such as Momentum and Adagrad use slots too.
These variables must be initialized before you can train a model.
The normal way to initialize variables is to call tf.initialize_all_variables() which adds ops to initialize the variables present in the graph when it is called.
(Aside: unlike its name suggests, initialize_all_variables() does not initialize anything, it only add ops that will initialize the variables when run.)
What you must do is call initialize_all_variables() after you have added the optimizer:
...build your model...
# Add the optimizer
train_op = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# Add the ops to initialize variables. These will include
# the optimizer slots added by AdamOptimizer().
init_op = tf.initialize_all_variables()
# launch the graph in a session
sess = tf.Session()
# Actually intialize the variables
sess.run(init_op)
# now train your model
for ...:
sess.run(train_op)