Keras: Use the same layer in different models (share weights)

Submitted anonymously (unverified) on 2019-12-03 01:48:02

Question:

Quick answer:

This is in fact really easy. Here's the code (for those who don't want to read all that text):

from keras.layers import Input, Dense
from keras.models import Model

inputs = Input((784,))
encode = Dense(10, input_shape=[784])(inputs)
decode = Dense(784, input_shape=[10])

model = Model(input=inputs, output=decode(encode))

inputs_2 = Input((10,))
decode_model = Model(input=inputs_2, output=decode(inputs_2))

In this setup, the decode_model will use the same decode layer as the model. If you train the model, the decode_model will be trained, too.
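The reason this works is that a layer object owns its weights, so calling the same layer object from two models makes both models reference the same weight arrays. Here is a framework-free NumPy sketch of that idea (SharedDense and the two model functions are illustrative names, not Keras API):

```python
import numpy as np

class SharedDense:
    """Minimal dense layer that owns its own weights (illustrative, not Keras)."""
    def __init__(self, in_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, (in_dim, out_dim))
        self.b = np.zeros(out_dim)

    def __call__(self, x):
        return x @ self.W + self.b

decode = SharedDense(10, 784)

# Two "models" that both call the same layer object.
def autoencoder_decode(z):
    return decode(z)

def standalone_decode(z):
    return decode(z)

# Updating the shared weights changes the output of BOTH models.
z = np.ones((1, 10))
before = standalone_decode(z).copy()
decode.W += 0.5                    # stand-in for one training step on the autoencoder
after = standalone_decode(z)
assert not np.allclose(before, after)   # the standalone model sees the update
```

This mirrors what Keras does: training `model` updates `decode`'s weights, and `decode_model` sees the updated weights because it holds the same layer object.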

Actual question:

I'm trying to create a simple autoencoder for MNIST in Keras:

This is the code so far:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
encode = Dense(10, input_shape=[784])
decode = Dense(784, input_shape=[10])

model.add(encode)
model.add(decode)

model.compile(loss="mse",
              optimizer="adadelta",
              metrics=["accuracy"])

decode_model = Sequential()
decode_model.add(decode)

I'm training it to learn the identity function

model.fit(X_train, X_train, batch_size=50, nb_epoch=10, verbose=1,
          validation_data=(X_test, X_test))

The reconstruction is quite interesting:

But I would also like to look at the cluster representations. What is the output of passing [1, 0, ..., 0] to the decoding layer? This should be the "cluster mean" of one class in MNIST.
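For a plain Dense layer this "cluster mean" even has a closed form: with a kernel W of shape (10, 784) and bias b, feeding in the one-hot vector e_i just selects row i of the kernel plus the bias. A small NumPy check (the shapes follow Keras' (input_dim, units) kernel convention; the weights here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(42)
W = rng.normal(size=(10, 784))   # decode kernel: (input_dim, units)
b = rng.normal(size=784)         # decode bias

def decode(x):
    return x @ W + b             # what a Dense layer computes (before activation)

i = 3
one_hot = np.zeros(10)
one_hot[i] = 1.0

# Passing e_i through the layer returns row i of the kernel plus the bias.
assert np.allclose(decode(one_hot), W[i] + b)
```

So each "cluster mean" image is simply one row of the trained decoder kernel, shifted by the bias.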

In order to do that I created a second model decode_model, which reuses the decoder layer. But if I try to use that model, it complains:

Exception: Error when checking : expected dense_input_5 to have shape (None, 784) but got array with shape (10, 10)

That seemed strange: it's simply a dense layer, and its weight matrix wouldn't even be able to process 784-dimensional input. I decided to look at the model summary:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_14 (Dense)                 (None, 784)           8624        dense_13[0][0]
====================================================================================================
Total params: 8624

It is connected to dense_13. It's difficult to keep track of the layer names, but that looks like the encoder layer. Sure enough, the summary of the whole model is:

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_13 (Dense)                 (None, 10)            7850        dense_input_6[0][0]
____________________________________________________________________________________________________
dense_14 (Dense)                 (None, 784)           8624        dense_13[0][0]
====================================================================================================
Total params: 16474
____________________________________________________________________________________________________
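The parameter counts in these summaries are consistent with two Dense layers: a Dense layer with input_dim inputs and units outputs has input_dim * units weights plus units biases. A quick arithmetic check of the numbers above:

```python
# Dense parameter count: input_dim * units + units (weights + biases)
encode_params = 784 * 10 + 10    # dense_13: 784 -> 10
decode_params = 10 * 784 + 784   # dense_14: 10 -> 784

assert encode_params == 7850
assert decode_params == 8624
assert encode_params + decode_params == 16474   # total in the full model
```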

Apparently the layers are permanently connected. Strangely, there is no input layer in my decode_model.

How can I reuse a layer in Keras? I've looked at the functional API, but there, too, layers seem to be fused together.

Answer 1:

Oh, nevermind.

I should have read the entire functional API: https://keras.io/getting-started/functional-api-guide/#shared-layers

Here's one of the predictions (maybe still lacking some training):

I'm guessing this could be a 3 ? Well at least it works now.

And for those with similar problems, here's the updated code:

from keras.layers import Input, Dense
from keras.models import Model

inputs = Input((784,))
encode = Dense(10, input_shape=[784])(inputs)
decode = Dense(784, input_shape=[10])

model = Model(input=inputs, output=decode(encode))

model.compile(loss="mse",
              optimizer="adadelta",
              metrics=["accuracy"])

inputs_2 = Input((10,))
decode_model = Model(input=inputs_2, output=decode(inputs_2))

I only compiled one of the models. You need to compile a model for training; for prediction that is not necessary.
