I have sequences of long 1-D vectors (3000 values each) that I am trying to classify. I have previously implemented a simple CNN to classify them with relative success:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

def create_shallow_model(shape, repeat_length, stride):
    model = Sequential()
    model.add(Conv1D(75, repeat_length, strides=stride, padding='same',
                     input_shape=shape, activation='relu'))
    model.add(MaxPooling1D(repeat_length))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model
However, I am looking to improve performance by stacking an LSTM/RNN onto the end of the network. I am having difficulty with this, as I cannot seem to find a way for the network to accept the data.
from keras.layers import TimeDistributed, LSTM

def cnn_lstm(shape, repeat_length, stride):
    model = Sequential()
    model.add(TimeDistributed(Conv1D(75, repeat_length, strides=stride,
                                     padding='same', activation='relu'),
                              input_shape=(None,) + shape))
    model.add(TimeDistributed(MaxPooling1D(repeat_length)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(6, return_sequences=True))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='sparse_categorical_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model
model = cnn_lstm(X.shape[1:], 1000, 1)
tprs, aucs = calculate_roc(model, 3, 100, train_X, train_y,
                           test_X, test_y, tprs, aucs)
But I get the following error:
ValueError: Error when checking input: expected time_distributed_4_input to have 4 dimensions, but got array with shape (50598, 3000, 1)
My questions are:
Is this a correct way of analysing this data?
If so, how do I get the network to accept and classify the input sequences?
There is no need to add those TimeDistributed wrappers: TimeDistributed expects every sample to carry an extra time dimension (a 4-D input overall), which is exactly why Keras rejects your 3-D array of shape (50598, 3000, 1). Currently, before adding the LSTM layer, your model looks like this (I have assumed repeat_length=5 and stride=1):
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1d_2 (Conv1D)            (None, 3000, 75)          450
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 600, 75)           0
_________________________________________________________________
flatten_2 (Flatten)          (None, 45000)             0
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 45001
=================================================================
Total params: 45,451
Trainable params: 45,451
Non-trainable params: 0
_________________________________________________________________
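For reference, a summary like the one above can be printed with model.summary(); a minimal sketch, assuming the (3000, 1) input shape implied by your error message:

# Assumes input shape (3000, 1), inferred from the (50598, 3000, 1) array in the error.
model = create_shallow_model((3000, 1), 5, 1)
model.summary()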
So if you want to add an LSTM layer, you can put it right after the MaxPooling1D layer, e.g. model.add(LSTM(16, activation='relu')), and just remove the Flatten layer, since the LSTM consumes the 3-D (batch, steps, features) output of the pooling layer directly. Now the model looks like this:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1d_4 (Conv1D)            (None, 3000, 75)          450
_________________________________________________________________
max_pooling1d_3 (MaxPooling1 (None, 600, 75)           0
_________________________________________________________________
lstm_1 (LSTM)                (None, 16)                5888
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 17
=================================================================
Total params: 6,355
Trainable params: 6,355
Non-trainable params: 0
_________________________________________________________________
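Putting this together, here is a minimal sketch of the fixed model (the function name cnn_lstm_fixed is mine; the layer choices follow the suggestion above, and it restores binary_crossentropy as discussed below):

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

def cnn_lstm_fixed(shape, repeat_length, stride):
    model = Sequential()
    # Plain Conv1D/MaxPooling1D over the 1-D sequences -- no TimeDistributed.
    model.add(Conv1D(75, repeat_length, strides=stride, padding='same',
                     input_shape=shape, activation='relu'))
    model.add(MaxPooling1D(repeat_length))
    # The LSTM reads the pooled sequence directly; no Flatten needed.
    model.add(LSTM(16, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model

# Usage mirrors the original call:
# model = cnn_lstm_fixed(X.shape[1:], 1000, 1)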
If you want, you can pass the return_sequences=True argument to the LSTM layer and keep the Flatten layer, as sketched below. But only do that after you have tried the first approach and gotten poor results, since return_sequences=True may not be necessary at all; it only increases your model size and can degrade performance.
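For completeness, relative to the sketch above that variant only changes the tail of the model:

# Variant: keep the LSTM output at every timestep and flatten it
# for the final Dense layer (Flatten must be imported as well).
model.add(LSTM(16, activation='relu', return_sequences=True))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))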
As a side note: why did you change the loss function to sparse_categorical_crossentropy in the second model? There is no need to do that; binary_crossentropy works fine with a single sigmoid output.
Source: https://stackoverflow.com/questions/51992336/keras-cnn-lstm-input-layer-not-accepting-1-d-input