tensorflow: Variable RNNLM/RNNLM/embedding/Adam_2/ does not exist

Submitted by 送分小仙女 on 2019-12-12 06:13:33

Question


My problem is quite similar to tensorflow embeddings don't exist after first RNN example, but I don't think it answers my question.

I posted my entire file at https://paste.ubuntu.com/24253170/, but I believe the following code is what really matters.

I get this error message:

    ValueError: Variable RNNLM/RNNLM/embedding/Adam_2/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?
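For context, TensorFlow 1.x raises this message whenever tf.get_variable() is asked for a variable that does not exist while the enclosing scope has reuse enabled; here is a minimal sketch with made-up names (not from my model):

import tensorflow as tf

with tf.variable_scope('demo') as scope:
    v = tf.get_variable('embedding', shape=[10, 5])   # created normally
    scope.reuse_variables()
    # Requesting a variable that was never created while reuse=True raises
    # "Variable demo/missing does not exist, or was not created with
    # tf.get_variable(). Did you mean to set reuse=None in VarScope?"
    w = tf.get_variable('missing', shape=[10, 5])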

The last line of the following code is where this error occurred:

def test_RNNLM():
  config = Config()
  gen_config = deepcopy(config)
  gen_config.batch_size = gen_config.num_steps = 1

  # We create the training model and generative model
  with tf.variable_scope('RNNLM') as scope:
      model = RNNLM_Model(config)
      # This instructs gen_model to reuse the same variables as the model above
      scope.reuse_variables()
      gen_model = RNNLM_Model(gen_config)

I want gen_model to share the same variables as model, but then I get the error. However, if I change my code like this:

with tf.variable_scope('RNNLM') as scope:
    model = RNNLM_Model(config)
with tf.variable_scope('RNNLM') as scope:
    scope.reuse_variables()
    gen_model = RNNLM_Model(gen_config)

then I don't get the error message. I don't know why; perhaps I am messing up the variable_scope in my code.

Here is the constructor:

def __init__(self, config):
    self.config = config
    self.load_data(debug=True)
    self.add_placeholders()
    self.inputs = self.add_embedding()
    self.rnn_outputs = self.add_model(self.inputs)
    self.outputs = self.add_projection(self.rnn_outputs)

    # We want to check how well we correctly predict the next word
    # We cast o to float64 as there are numerical issues at hand
    # (i.e. sum(output of softmax) = 1.00000298179 and not 1)
    self.predictions = [tf.nn.softmax(tf.cast(o, 'float64')) for o in self.outputs]
    # Reshape the output into len(vocab) sized chunks - the -1 says as many as
    # needed to evenly divide
    output = tf.reshape(tf.concat(self.outputs, 1), [-1, len(self.vocab)])
    self.calculate_loss = self.add_loss_op(output)
    self.train_step = self.add_training_op(self.calculate_loss)

I found that the error occurs at the last line of the constructor when I create gen_model.

  def add_training_op(self, loss):
    """Sets up the training Ops.

    Creates an optimizer and applies the gradients to all trainable variables.
    The Op returned by this function is what must be passed to the
    `sess.run()` call to cause the model to train. See 

    https://www.tensorflow.org/versions/r0.7/api_docs/python/train.html#Optimizer

    for more information.

    Hint: Use tf.train.AdamOptimizer for this model.
          Calling optimizer.minimize() will return a train_op object.

    Args:
      loss: Loss tensor, from cross_entropy_loss.
    Returns:
      train_op: The Op for training.
    """
    opt = tf.train.AdamOptimizer(self.config.lr)
    train_op = opt.minimize(loss)
    # train_op = tf.train.AdamOptimizer(self.config.lr).minimize(loss)
    return train_op

This is the loss op:

def add_loss_op(self, output):
    """Adds loss ops to the computational graph.

    Hint: Use tensorflow.python.ops.seq2seq.sequence_loss to implement sequence loss. 

    Args:
      output: A tensor of shape (None, self.vocab)
    Returns:
      loss: A 0-d tensor (scalar)
    """
    output = tf.reshape(output, [self.config.batch_size, self.config.num_steps, len(self.vocab)])
    weights = tf.ones(shape=[self.config.batch_size, self.config.num_steps])
    loss = tf.contrib.seq2seq.sequence_loss(output, self.labels_placeholder, weights)
    return loss

This part is add_embedding. The error message contains "embedding", and if I change the name="embedding" argument in this method, the error message changes accordingly, but I don't know why (a small diagnostic sketch follows the method below).

  def add_embedding(self):
    """Add embedding layer.

    Hint: This layer should use the input_placeholder to index into the
          embedding.
    Hint: You might find tf.nn.embedding_lookup useful.
    Hint: You might find tf.split, tf.squeeze useful in constructing tensor inputs
    Hint: Check the last slide from the TensorFlow lecture.
    Hint: Here are the dimensions of the variables you will need to create:

      L: (len(self.vocab), embed_size)

    Returns:
      inputs: List of length num_steps, each of whose elements should be
              a tensor of shape (batch_size, embed_size).
    """
    # The embedding lookup is currently only implemented for the CPU
    with tf.device('/cpu:0'):
      embedding = tf.get_variable(name="embedding",
                                 shape=(len(self.vocab), self.config.embed_size),
                                  dtype=tf.float32,
                                  initializer=xavier_weight_init())

      lookup = tf.nn.embedding_lookup(params=embedding, ids=self.input_placeholder)
      wordsPerTime = tf.split(lookup, self.config.num_steps, axis=1)
      inputs = [tf.squeeze(word, axis=1) for word in wordsPerTime]
      return inputs
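One way to see where the "embedding" in the error comes from is to list the graph's variables after the first model (including its training op) has been built; a small diagnostic sketch:

# Adam adds slot variables for every trainable variable it optimizes,
# named after that variable, e.g. something like RNNLM/embedding/Adam
# and RNNLM/embedding/Adam_1 for the embedding matrix above.
for v in tf.global_variables():
    print(v.name)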

Answer 1:


I had a similar problem. It was corrected when the optimizer was put into its own variable_scope, like this:

with tf.variable_scope("Optimizer") as scope:
    train_op = tf.train.AdamOptimizer(self.config.lr).minimize(loss)




Answer 2:


I figured out what triggers the error. This is the constructor code (by the way, I use TensorFlow 1.0):

  def __init__(self, config):
    self.config = config
    self.load_data(debug=True)
    self.add_placeholders()
    self.inputs = self.add_embedding()
    self.rnn_outputs = self.add_model(self.inputs)
    self.outputs = self.add_projection(self.rnn_outputs)

    # We want to check how well we correctly predict the next word
    # We cast o to float64 as there are numerical issues at hand
    # (i.e. sum(output of softmax) = 1.00000298179 and not 1)
    self.predictions = [tf.nn.softmax(tf.cast(o, 'float64')) for o in self.outputs]
    # Reshape the output into len(vocab) sized chunks - the -1 says as many as
    # needed to evenly divide
    output = tf.reshape(tf.concat(self.outputs, 1), [-1, len(self.vocab)])
    self.calculate_loss = self.add_loss_op(output)
    ##############!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    # self.train_step = self.add_training_op(self.calculate_loss)

The last line would add train_step (i.e., create an optimizer to minimize the loss). If that line is not commented out, my code creates an optimizer for both model and gen_model, and then the error happens. But I still don't know why; is this a bug or something else?
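A minimal sketch of one possible way to keep training after commenting that line out, assuming the training op is only needed for the first (training) model: build both models under the shared scope, then create the training op outside it, so Adam's slot variables are not requested from a scope that is set to reuse.

with tf.variable_scope('RNNLM') as scope:
    model = RNNLM_Model(config)          # creates the trainable variables
    scope.reuse_variables()
    gen_model = RNNLM_Model(gen_config)  # shares them; no optimizer here

# Outside the reusing scope, Adam is free to create its slot variables
# (e.g. the ones named after "embedding") with tf.get_variable.
model.train_step = model.add_training_op(model.calculate_loss)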



Source: https://stackoverflow.com/questions/43013951/tensorflowvariable-rnnlm-rnnlm-embedding-adam-2-does-not-exist
