Understanding Keras LSTMs: Role of Batch-size and Statefulness


Sources

There are several sources out there explaining stateful / stateless LSTMs and the role of batch_size, which I've read already. I'll refer to them later in …

1 Answer

    Let me explain it via an example:

    So let's say you have the following series: 1,2,3,4,5,6,...,100. You have to decide how many timesteps your LSTM will learn from, and reshape your data accordingly, like below:

    If you decide time_steps = 5, you have to reshape your time series as a matrix of samples in this way:

    1,2,3,4,5 -> sample1

    2,3,4,5,6 -> sample2

    3,4,5,6,7 -> sample3

    etc...

    By doing so, you will end up with a matrix of shape (96 samples x 5 timesteps).

    This matrix should be reshaped as (96 x 5 x 1), indicating to Keras that you have just 1 time series. If you have more time series in parallel (as in your case), you do the same operation on each time series, so you will end up with n matrices (one for each time series), each of shape (96 samples x 5 timesteps).
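
    A minimal numpy sketch of that windowing and reshaping step (the variable names are mine; only the numbers come from the example):

        import numpy as np

        series = np.arange(1, 101)   # 1, 2, ..., 100
        time_steps = 5

        # overlapping windows: each row is one sample of 5 consecutive values
        windows = np.array([series[i:i + time_steps]
                            for i in range(len(series) - time_steps + 1)])
        print(windows.shape)         # (96, 5)

        # add the trailing axis Keras expects: (samples, timesteps, features)
        x = windows.reshape(-1, time_steps, 1)
        print(x.shape)               # (96, 5, 1)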

    For the sake of argument, let's say you have 3 time series. You should stack all three matrices into one single tensor of shape (96 samples x 5 timesteps x 3 timeseries). The first layer of your LSTM for this example would be:

        from keras.models import Sequential
        from keras.layers import LSTM
        model = Sequential()
        model.add(LSTM(32, input_shape=(5, 3)))
    

    The 32 as the first parameter is totally up to you. It means that at each point in time, your 3 time series will become 32 different variables in the output space. It is easier to think of each timestep as a fully connected layer with 3 inputs and 32 outputs, but with a different computation than FC layers.
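
    One way to make that concrete is the parameter count (a sketch; the 4608 comes from the four LSTM gates, each an affine map from the 3 inputs plus the 32 recurrent outputs to 32 units, versus 3*32 + 32 = 128 weights for a plain FC layer):

        from keras.models import Sequential
        from keras.layers import LSTM

        m = Sequential([LSTM(32, input_shape=(5, 3))])
        m.summary()   # 4608 trainable params: 4 * ((3 + 32) * 32 + 32)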

    If you are going to stack multiple LSTM layers, use the return_sequences=True parameter, so the layer will output the whole predicted sequence rather than just the last value.
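
    A quick sketch of the shape difference (assuming the standalone keras import; under TensorFlow 2 use tensorflow.keras instead):

        from keras.models import Sequential
        from keras.layers import LSTM

        seq = Sequential([LSTM(32, input_shape=(5, 3), return_sequences=True)])
        print(seq.output_shape)    # (None, 5, 32): one 32-vector per timestep

        last = Sequential([LSTM(32, input_shape=(5, 3))])
        print(last.output_shape)   # (None, 32): just the last timestep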

    Your target should be the next value in the series you want to predict.

    Putting it all together, let's say you have the following time series:

    Time series 1 (master): 1,2,3,4,5,6,..., 100

    Time series 2 (support): 2,4,6,8,10,12,..., 200

    Time series 3 (support): 3,6,9,12,15,18,..., 300

    Create the input and target tensors:

    x     -> y
    

    1,2,3,4,5 -> 6

    2,3,4,5,6 -> 7

    3,4,5,6,7 -> 8

    Reformat the rest of the time series in the same way, but forget about the targets, since you don't want to predict those series. A sketch of this preprocessing follows below.
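
    A numpy sketch of that preprocessing (variable names are mine; note that the last window of the master series has no next value, so only 95 of the 96 windows get a target):

        import numpy as np

        time_steps = 5
        s1 = np.arange(1, 101)   # master:  1, 2, ..., 100
        s2 = 2 * s1              # support: 2, 4, ..., 200
        s3 = 3 * s1              # support: 3, 6, ..., 300

        def windows(s):
            # overlapping windows that still have a "next value" after them
            return np.array([s[i:i + time_steps]
                             for i in range(len(s) - time_steps)])

        # stack the three windowed series along the last axis
        x = np.stack([windows(s1), windows(s2), windows(s3)], axis=-1)
        y = s1[time_steps:]      # next value of the master series only

        print(x.shape, y.shape)  # (95, 5, 3) (95,)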

    Create your model:

        from keras.models import Sequential
        from keras.layers import LSTM, Dense

        model = Sequential()
        model.add(LSTM(32, input_shape=(5, 3), return_sequences=True))  # input is (5 timesteps x 3 timeseries); output is (5 timesteps x 32 variables) because return_sequences=True
        model.add(LSTM(8))  # output is (8 variables): return_sequences defaults to False, so only the last timestep is kept
        model.add(Dense(1, activation='linear'))  # output is (1 value per sample); it is compared to the target variable
    

    Compile it and train. A good batch size is 32. The batch size is the number of samples your sample matrix is split into for faster computation. Just don't use stateful=True.
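
    For instance (a sketch; mse/adam and the epoch count are typical choices I'm assuming, reusing model, x, and y from the sketches above):

        model.compile(loss='mse', optimizer='adam')

        # stateless training: each batch of 32 windows is independent,
        # so Keras can shuffle them freely between epochs
        model.fit(x, y, batch_size=32, epochs=50)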
