I'm trying to build an RNN using LSTM cells. I made an LSTM model, followed by two DNN (fully connected) layers and one regression output layer.
I trained it on my data, and the final training loss came out to about 0.009. However, when I applied the model to the test data, the loss was about 0.5.
The first-epoch training loss is also about 0.5, so I think the trained variables are not being used by the test model.
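One way to check this suspicion (a diagnostic sketch, assuming the TF 1.x API used throughout my code) is to list every trainable variable in the graph after both models are built; if reuse worked, each weight name should appear only once:

    import tensorflow as tf

    # If the second model reused the first model's variables, each weight name
    # (e.g. 'model/W1:0') appears once. Duplicated names such as 'model_1/W1:0'
    # mean the second instance silently created its own copies.
    for v in tf.trainable_variables():
        print(v.name)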
The only difference between the training and test models is the batch size: training batch size = 100~200, test batch size = 1.
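Since the batch size is the only structural difference, an alternative I considered (a minimal sketch, assuming TF 1.x placeholders; the shapes below are illustrative, not my real config) is to leave the batch dimension as None so one graph can serve both batch sizes:

    import tensorflow as tf

    num_steps, num_features = 20, 10  # hypothetical values for illustration

    # A None batch dimension accepts batches of 100-200 in training and 1 at test time.
    x = tf.placeholder(tf.float32, [None, num_steps, num_features], name='input-x')
    y = tf.placeholder(tf.float32, [None, num_features], name='input-y')

    # The initial LSTM state can then be sized from the fed batch at run time,
    # e.g. cell.zero_state(tf.shape(x)[0], tf.float32).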
In the main function I create the LSTM instances. The model is built in the LSTM initializer:
    def __init__(self, config, train_model=None):
        self.sess = sess = tf.Session()

        self.num_steps = num_steps = config.num_steps
        self.lstm_size = lstm_size = config.lstm_size
        self.num_features = num_features = config.num_features
        self.num_layers = num_layers = config.num_layers
        self.num_hiddens = num_hiddens = config.num_hiddens
        self.batch_size = batch_size = config.batch_size
        self.train = train = config.train
        self.epoch = config.epoch
        self.learning_rate = learning_rate = config.learning_rate

        with tf.variable_scope('model') as scope:
            # Stacked LSTM cells
            self.lstm_cell = lstm_cell = tf.nn.rnn_cell.LSTMCell(
                lstm_size,
                initializer=tf.contrib.layers.xavier_initializer(uniform=False))
            self.cell = cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_layers)

        with tf.name_scope('placeholders'):
            self.x = tf.placeholder(tf.float32, [self.batch_size, num_steps, num_features],
                                    name='input-x')
            self.y = tf.placeholder(tf.float32, [self.batch_size, num_features],
                                    name='input-y')
            self.init_state = cell.zero_state(self.batch_size, tf.float32)

        with tf.variable_scope('model'):
            # Two DNN layers plus the regression output layer
            self.W1 = tf.Variable(tf.truncated_normal([lstm_size * num_steps, num_hiddens], stddev=0.1), name='W1')
            self.b1 = tf.Variable(tf.truncated_normal([num_hiddens], stddev=0.1), name='b1')
            self.W2 = tf.Variable(tf.truncated_normal([num_hiddens, num_hiddens], stddev=0.1), name='W2')
            self.b2 = tf.Variable(tf.truncated_normal([num_hiddens], stddev=0.1), name='b2')
            self.W3 = tf.Variable(tf.truncated_normal([num_hiddens, num_features], stddev=0.1), name='W3')
            self.b3 = tf.Variable(tf.truncated_normal([num_features], stddev=0.1), name='b3')

        self.output, self.loss = self.inference()

        tf.initialize_all_variables().run(session=sess)
        tf.initialize_variables([self.b2]).run(session=sess)

        if train_model is None:
            self.train_step = tf.train.GradientDescentOptimizer(self.learning_rate).minimize(self.loss)
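For reference, my understanding is that variable_scope reuse only applies to variables created with tf.get_variable, not to the plain tf.Variable constructor; here is a minimal sketch of that mechanism (the names and shapes are illustrative, not from my model):

    import tensorflow as tf

    def make_weight(num_hiddens):
        # tf.get_variable resolves the name inside the current variable_scope,
        # so a second call under reuse=True returns the existing variable.
        return tf.get_variable('W1', shape=[128, num_hiddens],
                               initializer=tf.truncated_normal_initializer(stddev=0.1))

    with tf.variable_scope('model'):
        w_train = make_weight(64)
    with tf.variable_scope('model', reuse=True):
        w_test = make_weight(64)

    print(w_train is w_test)  # True: both names resolve to the same variable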
Using the __init__ above, the two LSTM instances are created like this:
with tf.variable_scope("model",reuse=None): train_model = LSTM(main_config) with tf.variable_scope("model", reuse=True): predict_model = LSTM(predict_config)
After making the two LSTM instances, I trained train_model and then fed the test set into predict_model.
Why are the variables not reused?