What does the concatenate layer do in a Keras multi-task model?


Question


I am implementing a simple multi-task model in Keras, using the code given in the documentation under the heading Shared layers.

I know that in multi-task learning we share some of the initial layers of the model, while the final layers are kept specific to each task, as described at the link.
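For reference, below is a minimal sketch of that usual pattern (hard parameter sharing): one input, a shared trunk, and one small head per task. The layer sizes and task names are illustrative assumptions, not part of the question.

import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model

# Hypothetical shapes and sizes, chosen only to illustrate the pattern
inputs = Input(shape=(280, 256))
shared = LSTM(64)(inputs)  # shared trunk, used by every task

# One task-specific head per task
task_a = Dense(1, activation='sigmoid', name='task_a')(shared)
task_b = Dense(1, activation='sigmoid', name='task_b')(shared)

model = Model(inputs=inputs, outputs=[task_a, task_b])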

I have the following two cases in the Keras functional API: in the first I use keras.layers.concatenate, while in the second I do not use keras.layers.concatenate at all.

I am posting the code and the resulting model plot for each case below.

Case-1 code

import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.utils.vis_utils import plot_model

tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))

# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)

# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)

# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

# And add a logistic regression on top
predictions1 = Dense(1, activation='sigmoid')(merged_vector)
predictions2 = Dense(1, activation='sigmoid')(merged_vector)

# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

Case-1 Model
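As an aside, training this compiled two-output model would look roughly like the sketch below; the dummy arrays are placeholders I made up, shaped to match the (280, 256) inputs and the two binary outputs.

import numpy as np

# Dummy placeholder data matching the model's inputs and outputs
data_a = np.random.random((32, 280, 256))
data_b = np.random.random((32, 280, 256))
labels_1 = np.random.randint(2, size=(32, 1))
labels_2 = np.random.randint(2, size=(32, 1))

# Each output receives its own targets; Keras sums the per-output losses
model.fit([data_a, data_b], [labels_1, labels_2], epochs=1, batch_size=8)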

Case-2 code

import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model
from keras.utils.vis_utils import plot_model

tweet_a = Input(shape=(280, 256))
tweet_b = Input(shape=(280, 256))

# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)

# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)



# And add a logistic regression on top
predictions1 = Dense(1, activation='sigmoid')(encoded_a)
predictions2 = Dense(1, activation='sigmoid')(encoded_b)

# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=[predictions1, predictions2])

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

Case-2 Model

In both cases, only the LSTM layer is shared. Case-1 uses keras.layers.concatenate, while case-2 does not.

My question is: which one is multi-task learning, case-1 or case-2? Moreover, what does keras.layers.concatenate do in case-1?


Answer 1:


Both are multi-task models; whether a model is multi-task depends only on whether it has multiple outputs, with one task associated with each output.

The difference is that your first model explicitly concatenates the features produced by the shared layer, so both output tasks can consider information from both inputs. In the second model, each output is connected directly to one input only and never sees the other input; the only link between the two tasks there is the shared LSTM weights.
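To make the effect of concatenate concrete, here is a small shape check (a sketch, with made-up tensor names): concatenating two 64-dimensional vectors along the last axis yields one 128-dimensional vector, so a Dense head placed on the merged vector is connected to both inputs, while a head placed on a single encoding never sees the other input.

import keras
from keras.layers import Input, Dense

a = Input(shape=(64,))
b = Input(shape=(64,))

merged = keras.layers.concatenate([a, b], axis=-1)
print(merged.shape)  # (None, 128): the two feature vectors, side by side

# Case-1 style: this head receives information from both inputs
head_on_both = Dense(1, activation='sigmoid')(merged)

# Case-2 style: this head is wired to input `a` only
head_on_a = Dense(1, activation='sigmoid')(a)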



Source: https://stackoverflow.com/questions/57197895/what-does-concatenate-layers-do-in-keras-multitask
