Keras RNN with LSTM cells for predicting multiple output time series based on multiple input time series


Initial note. If the time series were short (for example T = 30), we would not need a stateful LSTM and a classic LSTM would work well. In the OP's question, the time series length is T = 3000, so learning can be very slow with a classic LSTM. Learning is improved by cutting the time series into pieces and using a stateful LSTM.

Stateful mode with N = batch_size. Stateful models are tricky with Keras, because you need to be careful about how you cut the time series and select the batch size. In the OP's question, the sample size is N = 100. Since training the model with batches of one hundred is acceptable (it is not a large number), we select batch_size = 100.

Selecting batch_size = N simplifies training because you do not need to reset states inside epochs (so there is no need to write an on_batch_begin callback).
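For reference, if we had chosen batch_size < N, such a callback would be needed. Here is a minimal sketch (the class name is mine; it assumes, as in the cutting described below, that nb_cuts = T / T_after_cut consecutive batches belong to the same group of series, and that states must be forgotten when a new group starts):

from keras.callbacks import Callback

class ResetStatesCallback(Callback):
    def __init__(self, nb_cuts):
        super(ResetStatesCallback, self).__init__()
        self.counter = 0
        self.nb_cuts = nb_cuts

    def on_batch_begin(self, batch, logs=None):
        # a new group of independent series begins: forget previous states
        if self.counter % self.nb_cuts == 0:
            self.model.reset_states()
        self.counter += 1

It would then be passed to model.fit through callbacks = [ResetStatesCallback(nb_cuts)], still with shuffle = False.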

The question of how to cut the time series remains. Cutting is a little technical, so I wrote a wrapper function that works in all cases.

def stateful_cut(arr, batch_size, T_after_cut):
    if len(arr.shape) != 3:
        # N: Independent sample size,
        # T: Time length,
        # m: Dimension
        print("ERROR: please format arr as a (N, T, m) array.")

    N = arr.shape[0]
    T = arr.shape[1]

    # We need T_after_cut * nb_cuts = T
    nb_cuts = int(T / T_after_cut)
    if nb_cuts * T_after_cut != T:
        print("ERROR: T_after_cut must divide T")

    # We need batch_size * nb_reset = N
    # If nb_reset = 1, we only reset after the whole epoch, so no need to reset
    nb_reset = int(N / batch_size)
    if nb_reset * batch_size != N:
        print("ERROR: batch_size must divide N")

    # Cutting (technical)
    cut1 = np.split(arr, nb_reset, axis=0)
    cut2 = [np.split(x, nb_cuts, axis=1) for x in cut1]
    cut3 = [np.concatenate(x) for x in cut2]
    cut4 = np.concatenate(cut3)
    return(cut4)
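As a quick sanity check of the cutting (a sketch on dummy data with the OP's dimensions: N = 100 series of length T = 3000 with m = 4 features, cut with batch_size = 100 and T_after_cut = 100, i.e. 30 pieces per series):

import numpy as np
dummy = np.zeros((100, 3000, 4)) # (N, T, m)
print(stateful_cut(dummy, 100, 100).shape) # (3000, 100, 4): 100 * 30 pieces of length 100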

From now on, it becomes easy to train the model. Since the OP example is very simple, we do not need additional preprocessing or regularization. I describe how to proceed step by step (for the impatient, the whole self-contained code is available at the very end of this post).

First, we load the data and reshape it with the wrapper function.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LSTM, TimeDistributed
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

##
# Data
##
N = X_train.shape[0] # number of samples
T = X_train.shape[1] # length of each time series
batch_size = N # number of time series considered together: batch_size | N
T_after_cut = 100 # length of each cut part of the time series: T_after_cut | T
dim_in = X_train.shape[2] # dimension of input time series
dim_out = y_train.shape[2] # dimension of output time series

inputs, outputs, inputs_test, outputs_test = \
  [stateful_cut(arr, batch_size, T_after_cut) for arr in \
  [X_train, y_train, X_test, y_test]]

Then we compile a model with 4 inputs, 3 outputs, and 1 hidden LSTM layer containing 10 units.

##
# Model
##
nb_units = 10

model = Sequential()
model.add(LSTM(batch_input_shape=(batch_size, None, dim_in),
               return_sequences=True, units=nb_units, stateful=True))
model.add(TimeDistributed(Dense(activation='linear', units=dim_out)))
model.compile(loss = 'mse', optimizer = 'rmsprop')

We train the model without resetting states. We can do this only because we have selected batch_size = N.

##
# Training
##
epochs = 100

nb_reset = int(N / batch_size)
if nb_reset > 1:
    print("ERROR: We need to reset states when batch_size < N")

# When nb_reset = 1, we do not need to reinitialize states
history = model.fit(inputs, outputs, epochs = epochs, 
                    batch_size = batch_size, shuffle=False,
                    validation_data=(inputs_test, outputs_test))

We get the evolution of the training/test loss as follows:

Now, we define a 'mime' model which is stateless but contains our stateful weights. [Why do it like this? In Keras, predicting with a stateful model through model.predict requires a complete batch, but we may not have a complete batch to predict...]

## Mime model which is stateless but containing stateful weights
model_stateless = Sequential()
model_stateless.add(LSTM(input_shape=(None, dim_in),
               return_sequences=True, units=nb_units))
model_stateless.add(TimeDistributed(Dense(activation='linear', units=dim_out)))
model_stateless.compile(loss = 'mse', optimizer = 'rmsprop')
model_stateless.set_weights(model.get_weights())
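The stateless model can then predict a whole series of any length in a single call, for example (this is the same prediction step as in the complete code at the end of the post):

i = 0 # time series selected (between 0 and N-1)
y_hat = model_stateless.predict(np.array([X_train[i]]))[0] # shape (T, dim_out)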

Finally, we can show our incredible predictions on our long time series y1, y2 and y3 (blue for the true output; orange for the predicted output):

For y1:

For y2:

For y3:

Conclusion: It works almost perfectly, except for the first 2-3 time steps, where the series is unpredictable by definition. We do not observe any burst when going from one batch to the next.

Going further. When N is large, we would like to select batch_size | N with batch_size < N. I have written the full code for this case in https://github.com/ahstat/deep-learning/blob/master/rnn/4_lagging_and_stateful.py (Parts C and D). This GitHub path also shows the efficiency of a classic LSTM for short time series (Part A) and its inefficiency for long time series (Part B). I've written a blog post detailing how to use Keras for time series predictions here: https://ahstat.github.io/RNN-Keras-time-series/ .

Complete self-contained code

################
# Code from OP #
################
import numpy as np
def random_sample(len_timeseries=3000):
    Nchoice = 600
    x1 = np.cos(np.arange(0,len_timeseries)/float(1.0 + np.random.choice(Nchoice)))
    x2 = np.cos(np.arange(0,len_timeseries)/float(1.0 + np.random.choice(Nchoice)))
    x3 = np.sin(np.arange(0,len_timeseries)/float(1.0 + np.random.choice(Nchoice)))
    x4 = np.sin(np.arange(0,len_timeseries)/float(1.0 + np.random.choice(Nchoice)))
    y1 = np.random.random(len_timeseries)
    y2 = np.random.random(len_timeseries)
    y3 = np.random.random(len_timeseries)
    for t in range(3,len_timeseries):
        ## the output time series depend on input as follows: 
        y1[t] = x1[t-2] 
        y2[t] = x2[t-1]*x3[t-2]
        y3[t] = x4[t-3]
    y = np.array([y1,y2,y3]).T
    X = np.array([x1,x2,x3,x4]).T
    return y, X
def generate_data(Nsequence = 1000):
    X_train = []
    y_train = []
    for isequence in range(Nsequence):
        y, X = random_sample()
        X_train.append(X)
        y_train.append(y)
    return np.array(X_train),np.array(y_train)

Nsequence = 100
prop = 0.5
Ntrain = int(Nsequence*prop)
X, y = generate_data(Nsequence)
X_train = X[:Ntrain,:,:]
X_test  = X[Ntrain:,:,:]
y_train = y[:Ntrain,:,:]
y_test  = y[Ntrain:,:,:] 

#X.shape = (N sequence, length of time series, N input features)
#y.shape = (N sequence, length of time series, N targets)
print(X.shape, y.shape)
# (100, 3000, 4) (100, 3000, 3)

####################
# Cutting function #
####################
def stateful_cut(arr, batch_size, T_after_cut):
    if len(arr.shape) != 3:
        # N: Independent sample size,
        # T: Time length,
        # m: Dimension
        print("ERROR: please format arr as a (N, T, m) array.")

    N = arr.shape[0]
    T = arr.shape[1]

    # We need T_after_cut * nb_cuts = T
    nb_cuts = int(T / T_after_cut)
    if nb_cuts * T_after_cut != T:
        print("ERROR: T_after_cut must divide T")

    # We need batch_size * nb_reset = N
    # If nb_reset = 1, we only reset after the whole epoch, so no need to reset
    nb_reset = int(N / batch_size)
    if nb_reset * batch_size != N:
        print("ERROR: batch_size must divide N")

    # Cutting (technical)
    cut1 = np.split(arr, nb_reset, axis=0)
    cut2 = [np.split(x, nb_cuts, axis=1) for x in cut1]
    cut3 = [np.concatenate(x) for x in cut2]
    cut4 = np.concatenate(cut3)
    return(cut4)

#############
# Main code #
#############
from keras.models import Sequential
from keras.layers import Dense, LSTM, TimeDistributed
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

##
# Data
##
N = X_train.shape[0] # number of samples
T = X_train.shape[1] # length of each time series
batch_size = N # number of time series considered together: batch_size | N
T_after_cut = 100 # length of each cut part of the time series: T_after_cut | T
dim_in = X_train.shape[2] # dimension of input time series
dim_out = y_train.shape[2] # dimension of output time series

inputs, outputs, inputs_test, outputs_test = \
  [stateful_cut(arr, batch_size, T_after_cut) for arr in \
  [X_train, y_train, X_test, y_test]]

##
# Model
##
nb_units = 10

model = Sequential()
model.add(LSTM(batch_input_shape=(batch_size, None, dim_in),
               return_sequences=True, units=nb_units, stateful=True))
model.add(TimeDistributed(Dense(activation='linear', units=dim_out)))
model.compile(loss = 'mse', optimizer = 'rmsprop')

##
# Training
##
epochs = 100

nb_reset = int(N / batch_size)
if nb_reset > 1:
    print("ERROR: We need to reset states when batch_size < N")

# When nb_reset = 1, we do not need to reinitialize states
history = model.fit(inputs, outputs, epochs = epochs, 
                    batch_size = batch_size, shuffle=False,
                    validation_data=(inputs_test, outputs_test))

def plotting(history):
    plt.plot(history.history['loss'], color = "red")
    plt.plot(history.history['val_loss'], color = "blue")
    red_patch = mpatches.Patch(color='red', label='Training')
    blue_patch = mpatches.Patch(color='blue', label='Test')
    plt.legend(handles=[red_patch, blue_patch])
    plt.xlabel('Epochs')
    plt.ylabel('MSE loss')
    plt.show()

plt.figure(figsize=(10,8))
plotting(history) # Evolution of training/test loss

##
# Visual checking for a time series
##
## Mime model which is stateless but containing stateful weights
model_stateless = Sequential()
model_stateless.add(LSTM(input_shape=(None, dim_in),
               return_sequences=True, units=nb_units))
model_stateless.add(TimeDistributed(Dense(activation='linear', units=dim_out)))
model_stateless.compile(loss = 'mse', optimizer = 'rmsprop')
model_stateless.set_weights(model.get_weights())

## Prediction of a new set
i = 0 # time series selected (between 0 and N-1)
x = X_train[i]
y = y_train[i]
y_hat = model_stateless.predict(np.array([x]))[0]

for dim in range(3): # dim = 0 for y1 ; dim = 1 for y2 ; dim = 2 for y3.
    plt.figure(figsize=(10,8))
    plt.plot(range(T), y[:,dim])
    plt.plot(range(T), y_hat[:,dim])
    plt.show()

## Conclusion: works almost perfectly.