Stock price predictions of keras multilayer LSTM model converge to a constant value

问题

I've made a multilayer LSTM model that uses regression to predict next frame's values of the data. The model finishes after 20 epochs. I then get some predictions and compare them to my ground truth values. As you can see them in the picture above, predictions converge to a constant value. I don't know why this happens. Here is my model so far:

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers import LSTM, BatchNormalization
from tensorflow.python.keras.initializers import RandomUniform

init = RandomUniform(minval=-0.05, maxval= 0.05)

model = Sequential()

model.add(LSTM(kernel_initializer=init, activation='relu', return_sequences=True, units=800, dropout=0.5, recurrent_dropout=0.2, input_shape=(x_train.shape[1], x_train.shape[2]) ))
model.add(LSTM(kernel_initializer=init, activation='relu', return_sequences=False, units=500, dropout=0.5, recurrent_dropout=0.2 ))

model.add(Dense(1024, activation='linear', kernel_initializer=init))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(1, activation='linear', kernel_initializer= 'normal'))

model.compile(loss='mean_squared_error', optimizer='rmsprop' )
model.summary()

EDIT1: I decreased epochs from 20 to 3. results are as follows:

By comparing 2 pictures, I can conclude that when the number of epochs increases, the predictions are more likely to converge to some specific value which is around -0.1.

回答1:

So, after trying different number of LSTM units and different types of architectures, I realized that the current number of LSTM units causes the model to learns so slowly and 20 epochs were not sufficient for such huge model.For each layer, I changed the number of LSTM units to 64 and also removed Dense(1024)layer and increased the number of epochs from 20 to 400 and results were incredibly close to the ground truth values. I should mention that the dataset used in the new model was different from the former one because I encountered some problems with that dataset . here is the new model:

from keras.optimizers import RMSprop
from keras.initializers import glorot_uniform, glorot_normal, RandomUniform

init = glorot_normal(seed=None)
init1 = RandomUniform(minval=-0.05, maxval=0.05)
optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)

model = Sequential()

model.add(LSTM(units=64, dropout=0.2, recurrent_dropout=0.2, 
               input_shape=(x_train.shape[1], x_train.shape[2]), 
               return_sequences=True, kernel_initializer=init))

model.add(LSTM(units=64, dropout=0.2, recurrent_dropout=0.2, 
               return_sequences=False, kernel_initializer=init))


model.add(Dense(1, activation='linear', kernel_initializer= init1))
model.compile(loss='mean_squared_error', optimizer=optimizer )
model.summary()

you can see the predictions here:

It's still not the best model, but at least outperformed the former one. If you have any further recommendation on how to improve it, it'll be greatly appreciated.

来源：https://stackoverflow.com/questions/51618251/stock-price-predictions-of-keras-multilayer-lstm-model-converge-to-a-constant-va

标签

python

tensorflow

keras

regression

lstm