Question
I'm trying to figure out the proper syntax for the model I'm trying to fit. It's a time-series prediction problem, and I want to use a few dense layers to improve the representation of the time series before I feed it to the LSTM.
Here's a dummy series that I'm working with:
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
import numpy as np
import keras as K
import tensorflow as tf
d = pd.DataFrame(data = {"x": np.linspace(0, 100, 1000)})
d['l1_x'] = d.x.shift(1)
d['l2_x'] = d.x.shift(2)
d.fillna(0, inplace = True)
d["y"] = np.sin(.1*d.x*np.sin(d.l1_x))*np.sin(d.l2_x)
plt.plot(d.x, d.y)
First, I'll fit an LSTM with no dense layers preceding it. This requires that I reshape the data:
X = d[["x", "l1_x", "l2_x"]].values.reshape(len(d), 3, 1)
y = d.y.values
Is this correct?
The tutorials make it seem like a single time series should have 1 in the first dimension, followed by the number of time steps (1000), followed by the number of covariates (3). But when I do that the model doesn't compile.
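For reference, a quick sanity check of the two candidate shapes (Keras LSTMs expect input of shape (samples, timesteps, features); the array names below are just illustrative):
X_steps = d[["x", "l1_x", "l2_x"]].values.reshape(len(d), 3, 1)  # 1000 samples, 3 timesteps, 1 feature
X_feats = d[["x", "l1_x", "l2_x"]].values.reshape(len(d), 1, 3)  # 1000 samples, 1 timestep, 3 features
print(X_steps.shape, X_feats.shape)  # (1000, 3, 1) (1000, 1, 3)
A shape of (1, 1000, 3) would mean a single sample of 1000 timesteps, which cannot be paired with the 1000 per-row targets in y.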
Here I compile and train the model:
model = K.Sequential()
model.add(K.layers.LSTM(10, input_shape=(X.shape[1], X.shape[2]), batch_size = 1, stateful=True))
model.add(K.layers.Dense(1))
callbacks = [K.callbacks.EarlyStopping(monitor='loss', min_delta=0, patience=5, verbose=1, mode='auto', baseline=None, restore_best_weights=True)]
model.compile(loss='mean_squared_error', optimizer='rmsprop')
model.fit(X, y, epochs=50, batch_size=1, verbose=1, shuffle=False, callbacks = callbacks)
model.reset_states()
yhat = model.predict(X, batch_size=1)  # a stateful model must predict with the same batch size it was trained with
plt.clf()
plt.plot(d.x, d.y)
plt.plot(d.x, yhat)
Why can't I get the model to overfit? Is it because I've reshaped my data wrong? It doesn't overfit any more noticeably when I use more nodes in the LSTM.
(I'm also not clear on what it means to be "stateful". Neural networks are just nonlinear models, so which parameters are the "states", and why would one want to reset them?)
How do I interpose dense layers between the input and the LSTM?
Finally, I'd like to add a bunch of dense layers to do a basis expansion on x before it gets to the LSTM. But an LSTM wants a 3D array and a dense layer spits out a matrix. What do I do here? This doesn't work:
model = K.Sequential()
model.add(K.layers.Dense(10, activation = "relu", input_dim = 3))
model.add(K.layers.LSTM(3, input_shape=(10, X.shape[2]), batch_size = 1, stateful=True))
model.add(K.layers.Dense(1))
ValueError: Input 0 is incompatible with layer lstm_2: expected ndim=3, found ndim=2
Answer 1:
For the first question: I am doing the same thing and I didn't get any error, so please share the exact error you are seeing.
Note: I will give an example using the functional API, which gives a little more freedom (personal opinion).
from keras.layers import Dense, Flatten, LSTM, Activation
from keras.layers import Dropout, RepeatVector, TimeDistributed
from keras import Input, Model
seq_length = 15
input_dims = 10
output_dims = 8
n_hidden = 10
model1_inputs = Input(shape=(seq_length, input_dims))
net1 = LSTM(n_hidden, return_sequences=True)(model1_inputs)  # output at every time step
net1 = LSTM(n_hidden, return_sequences=False)(net1)          # output at the last time step only
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1
model1 = Model(inputs=model1_inputs, outputs=model1_outputs, name='model1')
## Fit the model
model1.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_11 (InputLayer)        (None, 15, 10)            0
_________________________________________________________________
lstm_8 (LSTM)                (None, 15, 10)            840
_________________________________________________________________
lstm_9 (LSTM)                (None, 10)                840
_________________________________________________________________
dense_9 (Dense)              (None, 8)                 88
=================================================================
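A minimal sketch of compiling and fitting this model (X_train and y_train here are hypothetical random arrays shaped to match, not part of the original answer):
import numpy as np
# hypothetical data: inputs (batch, seq_length, input_dims), targets (batch, output_dims)
X_train = np.random.rand(64, seq_length, input_dims)
y_train = np.random.rand(64, output_dims)
model1.compile(loss='mean_squared_error', optimizer='rmsprop')
model1.fit(X_train, y_train, epochs=5, batch_size=8, verbose=1)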
To your second question, there are two methods:
- If you are feeding data that is not a sequence, i.e. of shape (batch, input_dims), you can use RepeatVector, which repeats the same vector n_steps times to create the time dimension (the rolling steps) that the LSTM expects (a standalone shape check appears after the list):
seq_length = 15
input_dims = 16
output_dims = 8
n_hidden = 20
lstm_dims = 10
model1_inputs = Input(shape=(input_dims,))
net1 = Dense(n_hidden)(model1_inputs)
net1 = Dense(n_hidden)(net1)
net1 = RepeatVector(3)(net1)  # (batch, n_hidden) -> (batch, 3, n_hidden)
net1 = LSTM(lstm_dims, return_sequences=True)(net1)
net1 = LSTM(lstm_dims, return_sequences=False)(net1)
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1
model1 = Model(inputs=model1_inputs, outputs=model1_outputs, name='model1')
## Fit the model
model1.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_13 (InputLayer)        (None, 16)                0
_________________________________________________________________
dense_13 (Dense)             (None, 20)                340
_________________________________________________________________
dense_14 (Dense)             (None, 20)                420
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 3, 20)             0
_________________________________________________________________
lstm_14 (LSTM)               (None, 3, 10)             1240
_________________________________________________________________
lstm_15 (LSTM)               (None, 10)                840
_________________________________________________________________
dense_15 (Dense)             (None, 8)                 88
=================================================================
- If you are feeding a sequence of shape (seq_len, input_dims), you can use TimeDistributed, which applies the same Dense layer (the same weights) to every time step of the sequence:
seq_length = 15
input_dims = 10
output_dims = 8
n_hidden = 10
lstm_dims = 6
model1_inputs = Input(shape=(seq_length, input_dims))
net1 = TimeDistributed(Dense(n_hidden))(model1_inputs)  # same Dense weights at every time step
net1 = LSTM(output_dims, return_sequences=True)(net1)
net1 = LSTM(output_dims, return_sequences=False)(net1)
net1 = Dense(output_dims, activation='relu')(net1)
model1_outputs = net1
model1 = Model(inputs=model1_inputs, outputs=model1_outputs, name='model1')
## Fit the model
model1.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_17 (InputLayer)        (None, 15, 10)            0
_________________________________________________________________
time_distributed_3 (TimeDist (None, 15, 10)            110
_________________________________________________________________
lstm_18 (LSTM)               (None, 15, 8)             608
_________________________________________________________________
lstm_19 (LSTM)               (None, 8)                 544
_________________________________________________________________
dense_19 (Dense)             (None, 8)                 72
=================================================================
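As promised in the first bullet above, here is a standalone check of what RepeatVector does to shapes (an illustrative sketch only):
from keras import Input, Model
from keras.layers import RepeatVector
import numpy as np

inp = Input(shape=(20,))
rep = RepeatVector(3)(inp)  # copies the same vector 3 times: (None, 20) -> (None, 3, 20)
print(Model(inp, rep).predict(np.ones((1, 20))).shape)  # (1, 3, 20)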
Note: I stacked two LSTM layers. The first uses return_sequences=True, so it returns an output at every time step, and that sequence feeds the second layer, which returns an output only at the last time step.
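A quick shape comparison of the two settings (a sketch to make this concrete):
from keras import Input, Model
from keras.layers import LSTM

x = Input(shape=(15, 10))
seq = LSTM(10, return_sequences=True)(x)    # one output per time step: (None, 15, 10)
last = LSTM(10, return_sequences=False)(x)  # only the last time step: (None, 10)
print(Model(x, seq).output_shape, Model(x, last).output_shape)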
Source: https://stackoverflow.com/questions/54138205/shaping-data-for-lstm-and-feeding-output-of-dense-layers-to-lstm