Question
I have sequences of long 1-D vectors (3000 digits each) that I am trying to classify. I previously implemented a simple CNN that classifies them with relative success:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

def create_shallow_model(shape, repeat_length, stride):
    model = Sequential()
    model.add(Conv1D(75, repeat_length, strides=stride, padding='same', input_shape=shape, activation='relu'))
    model.add(MaxPooling1D(repeat_length))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    return model
However, I am looking to improve the performance by stacking an LSTM/RNN on the end of the network.
I am having difficulty with this as I cannot seem to find a way for the network to accept the data.
def cnn_lstm(shape, repeat_length, stride):
    model = Sequential()
    model.add(TimeDistributed(Conv1D(75, repeat_length, strides=stride, padding='same', activation='relu'), input_shape=(None,) + shape))
    model.add(TimeDistributed(MaxPooling1D(repeat_length)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(6, return_sequences=True))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])
    return model
model=cnn_lstm(X.shape[1:],1000,1)
tprs,aucs=calculate_roc(model,3,100,train_X,train_y,test_X,test_y,tprs,aucs)
But I get the following error:
ValueError: Error when checking input: expected time_distributed_4_input to have 4 dimensions, but got array with shape (50598, 3000, 1)
My questions are:
Is this a correct way of analysing this data?
If so, how do I get the network to accept and classify the input sequences?
Answer 1:
There is no need to add those TimeDistributed wrappers. Currently, before adding the LSTM layer, your model looks like this (I have assumed repeat_length=5 and stride=1):
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1d_2 (Conv1D)            (None, 3000, 75)          450
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 600, 75)           0
_________________________________________________________________
flatten_2 (Flatten)          (None, 45000)             0
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 45001
=================================================================
Total params: 45,451
Trainable params: 45,451
Non-trainable params: 0
_________________________________________________________________
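The numbers in this summary can be checked by hand. The following is a minimal sketch in plain Python, assuming an input shape of (3000, 1) together with the repeat_length=5 and stride=1 assumed above; the helper names conv1d_params and dense_params are made up for illustration and are not part of Keras.

```python
# Hand-check of the summary above.
# Assumptions: input shape (3000, 1), repeat_length=5, stride=1.

def conv1d_params(kernel_size, in_channels, filters):
    # Each filter has kernel_size * in_channels weights plus one bias.
    return kernel_size * in_channels * filters + filters

def dense_params(in_features, units):
    # One weight per input feature per unit, plus one bias per unit.
    return in_features * units + units

steps, channels = 3000, 1
filters, kernel_size, pool_size = 75, 5, 5

conv_steps = steps                      # padding='same' with stride 1 keeps the length
pooled_steps = conv_steps // pool_size  # MaxPooling1D strides default to pool_size
flat_features = pooled_steps * filters  # Flatten merges steps and channels

conv_p = conv1d_params(kernel_size, channels, filters)
dense_p = dense_params(flat_features, 1)

print(conv_p, pooled_steps, flat_features, dense_p, conv_p + dense_p)
# prints: 450 600 45000 45001 45451
```

Almost all of the parameters sit in the final Dense layer, because Flatten hands it every pooled timestep of every filter.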
So if you want to add an LSTM layer, you can put it right after the MaxPooling1D layer, like model.add(LSTM(16, activation='relu')), and just remove the Flatten layer. Now the model looks like this:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1d_4 (Conv1D)            (None, 3000, 75)          450
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 600, 75)           0
_________________________________________________________________
lstm_1 (LSTM)                (None, 16)                5888
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 17
=================================================================
Total params: 6,355
Trainable params: 6,355
Non-trainable params: 0
_________________________________________________________________
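Put together, the model described above could be written roughly as follows. This is a sketch rather than the asker's final code: it assumes the tf.keras API (the question appears to use standalone Keras), declares the input with keras.Input instead of input_shape, and keeps binary_crossentropy from the first model. The example call uses the shape (3000, 1) with repeat_length=5 and stride=1, as in the summary.

```python
from tensorflow import keras
from tensorflow.keras import layers

def cnn_lstm(shape, repeat_length, stride):
    # shape is the per-sample shape, e.g. X.shape[1:] == (3000, 1)
    model = keras.Sequential([
        keras.Input(shape=shape),
        # Conv1D already treats axis 1 as time, so no TimeDistributed wrapper is needed
        layers.Conv1D(75, repeat_length, strides=stride,
                      padding='same', activation='relu'),
        layers.MaxPooling1D(repeat_length),
        # No Flatten here: the LSTM consumes the (steps, channels) sequence directly
        layers.LSTM(16, activation='relu'),
        layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model

model = cnn_lstm((3000, 1), 5, 1)
model.summary()
```

With these settings the LSTM reads 600 timesteps of 75 features each and returns only its final 16-dimensional state, which is why the Dense layer shrinks from 45,001 parameters to 17.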
If you want, you can pass the return_sequences=True argument to the LSTM layer and keep the Flatten layer. But only try this after the first approach has given poor results: return_sequences=True may not be necessary at all, and it increases your model size (the Dense layer then receives the LSTM output at every timestep) and may hurt model performance.
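For comparison, the return_sequences=True variant might look like the sketch below. It makes the same assumptions as before (tf.keras API, input shape (3000, 1), repeat_length=5, stride=1), and the function name cnn_lstm_seq is made up for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

def cnn_lstm_seq(shape, repeat_length, stride):
    model = keras.Sequential([
        keras.Input(shape=shape),
        layers.Conv1D(75, repeat_length, strides=stride,
                      padding='same', activation='relu'),
        layers.MaxPooling1D(repeat_length),
        # return_sequences=True emits one 16-dim vector per timestep: (batch, steps, 16)
        layers.LSTM(16, activation='relu', return_sequences=True),
        # Flatten is needed again so Dense sees a single vector per sample
        layers.Flatten(),
        layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model

seq_model = cnn_lstm_seq((3000, 1), 5, 1)
seq_model.summary()
```

Here the Dense layer receives 600 * 16 = 9600 features, so the model grows from 6,355 to 15,939 parameters without any guarantee of better accuracy.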
As a side note: why did you change the loss function to sparse_categorical_crossentropy in the second model? There is no need to do that, since binary_crossentropy works fine with a single sigmoid output.
Source: https://stackoverflow.com/questions/51992336/keras-cnn-lstm-input-layer-not-accepting-1-d-input