LSTM Followed by Mean Pooling

Anonymous (unverified), submitted 2019-12-03 02:20:02

Question:

I'm using Keras 1.0. My problem is identical to this one (How to implement a Mean Pooling layer in Keras), but the answer there does not seem to be sufficient for me.

I want to implement this network:

The following code does not work:

    sequence = Input(shape=(max_sent_len,), dtype='int32')
    embedded = Embedding(vocab_size, word_embedding_size)(sequence)
    lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
    pool = AveragePooling1D()(lstm)
    output = Dense(1, activation='sigmoid')(pool)

If I don't set return_sequences=True, I get this error when I call AveragePooling1D():

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/PATH/keras/engine/topology.py", line 462, in __call__
        self.assert_input_compatibility(x)
      File "/PATH/keras/engine/topology.py", line 382, in assert_input_compatibility
        str(K.ndim(x)))
    Exception: ('Input 0 is incompatible with layer averagepooling1d_6: expected ndim=3', ' found ndim=2')

Otherwise, I get this error when I call Dense():

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/PATH/keras/engine/topology.py", line 456, in __call__
        self.build(input_shapes[0])
      File "/fs/clip-arqat/mossaab/trec/liveqa/cmu/venv/lib/python2.7/site-packages/keras/layers/core.py", line 512, in build
        assert len(input_shape) == 2
    AssertionError
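Both tracebacks seem to come down to tensor rank, which can be traced without running Keras at all. The following is a shape-only sketch in plain Python. It assumes AveragePooling1D keeps its default pool length of 2 and that Dense in this Keras version only accepts 2D input (as the assertion in the second traceback suggests). The helper names and the sizes are hypothetical, not part of the Keras API.

```python
# Shape bookkeeping only; no Keras required. None stands for the batch dimension.

def lstm_shape(input_shape, units, return_sequences):
    """Output shape of an LSTM given a (batch, timesteps, features) input."""
    batch, timesteps, _ = input_shape
    return (batch, timesteps, units) if return_sequences else (batch, units)


def average_pooling1d_shape(input_shape, pool_length=2):
    """Non-overlapping 1D average pooling over time; requires a 3D input."""
    if len(input_shape) != 3:
        raise ValueError("expected ndim=3, found ndim=%d" % len(input_shape))
    batch, timesteps, features = input_shape
    return (batch, timesteps // pool_length, features)


def global_average_shape(input_shape):
    """Averaging over the whole time axis removes that axis."""
    batch, _, features = input_shape
    return (batch, features)


# Hypothetical sizes chosen for illustration.
max_sent_len, word_embedding_size, hidden_state_size = 50, 100, 64
embedded = (None, max_sent_len, word_embedding_size)

seq = lstm_shape(embedded, hidden_state_size, return_sequences=True)    # (None, 50, 64)
last = lstm_shape(embedded, hidden_state_size, return_sequences=False)  # (None, 64)

pooled_windows = average_pooling1d_shape(seq)  # (None, 25, 64): still 3D, so a 2D-only Dense rejects it
pooled_all = global_average_shape(seq)         # (None, 64): 2D, which Dense accepts
print(seq, last, pooled_windows, pooled_all)
```

Passing `last` to `average_pooling1d_shape` raises the rank error, mirroring the first traceback. A mean over all timesteps therefore has to remove the time axis rather than merely shrink it.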

Answer 1:

Please try this (I hope this will solve your problem :) ):

http://keras.io/layers/core/#timedistributedmerge



Answer 2:

I think the accepted answer is basically wrong. A solution was found at https://github.com/fchollet/keras/issues/2151, but it only works with the Theano backend. I have modified the code so that it supports both Theano and TensorFlow.

    from keras.engine.topology import Layer, InputSpec
    from keras import backend as T


    class TemporalMeanPooling(Layer):
        """
        This is a custom Keras layer. This pooling layer accepts the temporal
        sequence output by a recurrent layer and performs temporal pooling,
        looking at only the non-masked portion of the sequence. The pooling
        layer converts the entire variable-length hidden vector sequence
        into a single hidden vector, and then feeds its output to the Dense
        layer.

        input shape: (nb_samples, nb_timesteps, nb_features)
        output shape: (nb_samples, nb_features)
        """

        def __init__(self, **kwargs):
            super(TemporalMeanPooling, self).__init__(**kwargs)
            self.supports_masking = True
            self.input_spec = [InputSpec(ndim=3)]

        def get_output_shape_for(self, input_shape):
            return (input_shape[0], input_shape[2])

        def call(self, x, mask=None):  # mask: (nb_samples, nb_timesteps)
            if mask is None:
                mask = T.mean(T.ones_like(x), axis=-1)
            mask = T.cast(mask, T.floatx())
            # Zero out masked timesteps so they do not contribute to the sum.
            x = x * T.expand_dims(mask)
            ssum = T.sum(x, axis=-2)  # (nb_samples, nb_features)
            rcnt = T.sum(mask, axis=-1, keepdims=True)  # (nb_samples, 1)
            return ssum / rcnt

        def compute_mask(self, input, mask):
            # The time axis is gone after pooling, so no mask is passed on.
            return None
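The arithmetic that a masked temporal mean is meant to perform can be checked outside Keras. Below is a minimal NumPy sketch, not the layer itself: the function name and the toy data are made up for illustration, and the mask is assumed to hold 1 for real timesteps and 0 for padding.

```python
import numpy as np


def temporal_mean_pooling(x, mask=None):
    """Average x over the time axis, counting only unmasked timesteps.

    x:    (nb_samples, nb_timesteps, nb_features)
    mask: (nb_samples, nb_timesteps), 1 for real steps and 0 for padding
    """
    x = np.asarray(x, dtype=float)
    if mask is None:
        mask = np.ones(x.shape[:2])
    mask = np.asarray(mask, dtype=float)
    summed = (x * mask[:, :, None]).sum(axis=1)  # (nb_samples, nb_features)
    counts = mask.sum(axis=1, keepdims=True)     # (nb_samples, 1)
    return summed / counts


# Two sequences of length 3 with 2 features; the second has one padded step.
x = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
              [[2.0, 0.0], [4.0, 2.0], [9.0, 9.0]]])
mask = np.array([[1, 1, 1],
                 [1, 1, 0]])

pooled = temporal_mean_pooling(x, mask)
print(pooled)  # rows: [3. 4.] and [3. 1.]
```

The padded step `[9.0, 9.0]` is excluded from the second row, so the divisor is 2 rather than 3. Without a mask, every timestep counts and the result is an ordinary mean over time.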


Answer 3:

I just attempted to implement the same model as the original poster, using Keras 2.0.3. Mean pooling after the LSTM worked when I used GlobalAveragePooling1D; just make sure to set return_sequences=True in the LSTM layer. Give it a try!
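A minimal sketch of that setup, assuming Keras 2 or later with a working backend installed. The sizes are hypothetical placeholders, and the other names follow the question's code.

```python
from keras.layers import Input, Embedding, LSTM, GlobalAveragePooling1D, Dense
from keras.models import Model

# Hypothetical sizes chosen for illustration.
max_sent_len = 50
vocab_size = 10000
word_embedding_size = 100
hidden_state_size = 64

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
# return_sequences=True keeps one hidden vector per timestep (3D output).
lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)
# Average over the time axis: (batch, timesteps, features) -> (batch, features).
pool = GlobalAveragePooling1D()(lstm)
output = Dense(1, activation='sigmoid')(pool)

model = Model(inputs=sequence, outputs=output)
print(model.output_shape)
```

Whether padded timesteps are left out of the average depends on masking being enabled and on the Keras version in use, so that is worth checking if the inputs are padded.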



Answer 4:

Adding TimeDistributed(Dense(1)) helped:

    sequence = Input(shape=(max_sent_len,), dtype='int32')
    embedded = Embedding(vocab_size, word_embedding_size)(sequence)
    lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
    distributed = TimeDistributed(Dense(1))(lstm)
    pool = AveragePooling1D()(distributed)
    output = Dense(1, activation='sigmoid')(pool)


Answer 5:

Thanks. I ran into the same problem, but I don't think the TimeDistributed layer does what you want. You can try Luke Guye's TemporalMeanPooling layer; it works for me. Here is an example:

    sequence = Input(shape=(max_sent_len,), dtype='int32')
    embedded = Embedding(vocab_size, word_embedding_size)(sequence)
    lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)
    pool = TemporalMeanPooling()(lstm)
    output = Dense(1, activation='sigmoid')(pool)

