I\'m using Keras 1.0. My problem is identical to this one (How to implement a Mean Pooling layer in Keras), but the answer there does not seem to be sufficient for me.
Adding TimeDistributed(Dense(1)) helped:
sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
distributed = TimeDistributed(Dense(1))(lstm)
pool = AveragePooling1D()(distributed)
output = Dense(1, activation='sigmoid')(pool)