Keras - Input a 3 channel image into LSTM

情书的邮戳 2020-12-28 21:13

I have read a sequence of images into a numpy array with shape (7338, 225, 1024, 3) where 7338 is the sample size, 225 are the time st

1 answer
  • 2020-12-28 21:29

    If you want to treat the images as a sequence (like the frames of a movie), you need to merge the pixels AND the channels into a single feature axis (1024 * 3 = 3072):

    input_shape = (225, 3072)  # 3D input once the batch dimension (7338) is included; it is not specified here
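
As a minimal sketch of that reshape, the snippet below merges the pixel and channel axes with NumPy and feeds the result to an LSTM. The tiny sample count, the 16 LSTM units, and the single-unit Dense head are placeholders chosen for illustration, not values from the question.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Small stand-in for the (7338, 225, 1024, 3) array from the question.
n_samples, n_steps, n_pixels, n_channels = 4, 225, 1024, 3
images = np.random.rand(n_samples, n_steps, n_pixels, n_channels).astype("float32")

# Merge the pixel and channel axes: (4, 225, 1024, 3) -> (4, 225, 3072).
sequences = images.reshape(n_samples, n_steps, n_pixels * n_channels)

model = keras.Sequential([
    keras.Input(shape=(n_steps, n_pixels * n_channels)),  # batch size is left out
    layers.LSTM(16),   # 16 units is an arbitrary choice for this sketch
    layers.Dense(1),   # placeholder output head
])

predictions = model.predict(sequences, verbose=0)
print(sequences.shape, predictions.shape)
```

The reshape only regroups the last two axes, so each time step keeps all of its own pixel and channel values.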
    

    If you want more processing before feeding 3072 features into an LSTM, you can combine or interleave 2D convolutions and LSTMs for a more refined model (not necessarily a better one, though; each application behaves differently).

    You can also try ConvLSTM2D, which takes a five-dimensional input (batch, time, rows, columns, channels). This assumes the 1024 pixels are flattened 32x32 images:

    input_shape = (225, 32, 32, 3)  # 5D input once the batch dimension (7338) is included; it is not specified here
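
A rough sketch of the ConvLSTM2D route follows. It assumes the 1024 pixels are row-major flattened 32x32 images; the filter count, kernel size, pooling layer, and output head are illustrative choices only.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Small stand-in for the (7338, 225, 1024, 3) array from the question.
images = np.random.rand(2, 225, 1024, 3).astype("float32")

# Assumption: each 1024-pixel vector is a row-major flattened 32x32 image.
frames = images.reshape(2, 225, 32, 32, 3)

model = keras.Sequential([
    keras.Input(shape=(225, 32, 32, 3)),  # batch size is left out
    # With the default return_sequences=False, only the last step is returned:
    # (batch, 32, 32, 4) because padding="same" keeps the spatial size.
    layers.ConvLSTM2D(filters=4, kernel_size=(3, 3), padding="same"),
    layers.GlobalAveragePooling2D(),      # -> (batch, 4)
    layers.Dense(1),                      # placeholder output head
])

predictions = model.predict(frames, verbose=0)
print(frames.shape, predictions.shape)
```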
    

    I'd probably create a convolutional net with several TimeDistributed(Conv2D(...)) and TimeDistributed(MaxPooling2D(...)) layers before adding a TimeDistributed(Flatten()) and finally the LSTM(). This will most likely improve both the model's image understanding and the performance of the LSTM.
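
The layer stack described above could look roughly like this. It is a sketch under the same 32x32 assumption; the number of convolution blocks, the filter counts, and the LSTM size are arbitrary and would need tuning for a real task.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Small stand-in batch, already reshaped to (samples, time, rows, cols, channels).
frames = np.random.rand(2, 225, 32, 32, 3).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(225, 32, 32, 3)),
    # TimeDistributed applies the same wrapped layer to every frame independently.
    layers.TimeDistributed(layers.Conv2D(8, (3, 3), activation="relu", padding="same")),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),   # -> (225, 16, 16, 8)
    layers.TimeDistributed(layers.Conv2D(16, (3, 3), activation="relu", padding="same")),
    layers.TimeDistributed(layers.MaxPooling2D((2, 2))),   # -> (225, 8, 8, 16)
    layers.TimeDistributed(layers.Flatten()),              # -> (225, 1024)
    layers.LSTM(32),                                       # reads the 225 feature vectors
    layers.Dense(1),                                       # placeholder output head
])

predictions = model.predict(frames, verbose=0)
print(predictions.shape)
```

Here the LSTM sees 1024 learned features per frame instead of 3072 raw values.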
