Question
I'm trying to build a single-output regression model, but there seems to be a problem in the last layer.
from keras.layers import Input, Dense, Dropout, TimeDistributed, CuDNNLSTM, concatenate
from keras.models import Model

inputs = Input(shape=(48, 1))
lstm = CuDNNLSTM(256, return_sequences=True)(inputs)
lstm = Dropout(dropouts[0])(lstm)  # dropouts is a list defined elsewhere
# aux_input
auxiliary_inputs = Input(shape=(48, 7))
auxiliary_outputs = TimeDistributed(Dense(4))(auxiliary_inputs)
auxiliary_outputs = TimeDistributed(Dense(7))(auxiliary_outputs)
# concatenate
output = concatenate([lstm, auxiliary_outputs])
output = TimeDistributed(Dense(64, activation='linear'))(output)
output = TimeDistributed(Dense(64, activation='linear'))(output)
output = TimeDistributed(Dense(1, activation='linear'))(output)
model = Model(inputs=[inputs, auxiliary_inputs], outputs=[output])
I am new to Keras. I am getting the following error:
ValueError: Error when checking target: expected time_distributed_5 to have 3 dimensions, but got array with shape (14724, 1)
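The mismatch can be reproduced with plain numpy stand-ins for the tensors involved (batch size 14724 is taken from the error message; numpy is used here instead of Keras purely to illustrate the shapes):

```python
import numpy as np

# Stand-ins for the tensors involved, assuming batch size 14724
# (taken from the error message); numpy only, no Keras needed.
batch, timesteps = 14724, 48

model_output = np.zeros((batch, timesteps, 1))  # TimeDistributed(Dense(1)): rank 3
targets = np.zeros((batch, 1))                  # regression targets: rank 2

# Keras compares these shapes at fit() time and raises the ValueError.
print(model_output.shape)  # (14724, 48, 1)
print(targets.shape)       # (14724, 1)

# Flattening the time axis makes the shapes compatible:
flattened = model_output.reshape(batch, -1)     # (14724, 48)
final = flattened @ np.zeros((timesteps, 1))    # Dense(1) -> (14724, 1)
print(final.shape)         # (14724, 1)
```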
Answer 1:
Okay, I think I found a fix. According to https://keras.io/layers/wrappers/, the TimeDistributed wrapper applies the Dense layer to each timestep (in my case, 48 timesteps). So the output of my final layer:
output = TimeDistributed(Dense(1, activation='linear'))(output)
has shape (batch_size, timesteps, dimensions), i.e. (?, 48, 1), hence the dimension mismatch with my (14724, 1) targets. To get a single regression output, the final TimeDistributed layer has to be flattened first. So I added the following lines to fix it:
output = Flatten()(output)
output = Dense(1, activation='linear')(output)
The Flatten layer turns the (?, 48, 1) output of the TimeDistributed layer into 48 inputs to the final Dense layer, which adds its own bias (49 parameters in total) and produces the single output.
Okay, the code works fine and I am getting proper results (the model learns). My only doubt is whether it is mathematically sound to flatten the TimeDistributed output into a plain Dense layer like this.
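As a quick sanity check of that "49" figure, the parameter count of the final Dense layer can be worked out by hand (plain Python; the (48, 1) shape is the TimeDistributed output from the question):

```python
# Parameter count of Dense(1) applied after Flatten, assuming the
# (48, 1) TimeDistributed output from the question.
timesteps, features = 48, 1

flat_inputs = timesteps * features   # Flatten: (?, 48, 1) -> (?, 48)
dense_params = flat_inputs * 1 + 1   # 48 weights + 1 bias

print(flat_inputs)   # 48
print(dense_params)  # 49
```

So the layer has 48 inputs; the 49 comes from the bias term, not from an extra input.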
Answer 2:
Can you provide more context for your problem? Test data, or at least more code. Why did you choose this architecture in the first place? Would a simpler architecture (just the LSTM) do the trick? What are you regressing? Stacking multiple TimeDistributed Dense layers with linear activation functions probably isn't adding much to the model.
Source: https://stackoverflow.com/questions/46782596/error-when-checking-target-expected-time-distributed-5-to-have-3-dimensions-bu