Creating a Keras layer of trainable weights


Question


Edit: see the bottom for how I fixed this.

I've written my own Keras layer, shown below:

from keras import backend as K
from keras.initializers import RandomNormal
from keras.layers import Dot, Layer

class Multed_Weights(Layer):

    def __init__(self, input_dim, output_dim, **kwargs):
        self.output_dim = output_dim
        self.input_dim = input_dim
        super(Multed_Weights, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(self.input_dim, self.output_dim),
                                      initializer=RandomNormal(mean=0., stddev=0.05, seed=None),
                                      trainable=True)
        super(Multed_Weights, self).build(input_shape)  # Be sure to call this somewhere!
        print("mult kernel has shape " + str(K.int_shape(self.kernel)))

    def call(self, x, **kwargs):
        # Dot is a merge layer applied to [x, kernel]; the traceback below points at this node.
        return Dot(axes=[1, 0])([x, self.kernel])

    def compute_output_shape(self, input_shape):
        output_shape = (None, self.output_dim)
        print("the output shape of multed weights is " + str(output_shape))
        return output_shape

Here's what I'm trying to do. I'm following https://arxiv.org/pdf/1503.08895.pdf (see just above citation (3) on page 2). I fixed my previous problem, but now I have another:

"InvalidArgumentError (see above for traceback): Incompatible shapes: [150,128] vs. [150,10000] [[Node: training/SGD/gradients/multed__weights_1/dot_2/Mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@multed__weights_1/dot_2/Mul"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](training/SGD/gradients/multed__weights_1/dot_2/Mul_grad/Shape, training/SGD/gradients/multed__weights_1/dot_2/Mul_grad/Shape_1)]]"

So it's still this class that's causing the problem. My batch size is 128, so it seems the error comes from this weight matrix not having a batch dimension. But Keras won't let me create it with shape=(None, ..., ...), so I don't know what to do.
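
Just to illustrate the constraint I'm hitting (a hypothetical snippet, not my actual layer): a trainable weight has to be allocated with a fully defined shape, so a dynamic None batch axis is rejected when the variable is created:

# Hypothetical illustration only: the backend cannot allocate a weight
# variable with an unknown (None) dimension, so this fails.
self.kernel = self.add_weight(name='kernel',
                              shape=(None, self.input_dim, self.output_dim),
                              initializer='uniform',
                              trainable=True)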

Update: I was too fixated on having a dynamic batch size in this layer. Once I hard-coded the shape to (128, self.input_dim, self.output_dim), knowing ahead of time that my batch size is 128, it worked fine. However, I realized this approach gives each of the 128 samples in the batch its own weights, so maybe I should set the shape as before and then RepeatVector it 128 times. I suspect each of the 128 copies would then reference the same weights rather than creating new ones, but I'm not sure.
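
A rough sketch of what I mean by the repeat idea (a hypothetical helper using the Keras backend rather than RepeatVector, assuming the TensorFlow backend; not part of my layer above):

from keras import backend as K

def tile_kernel_over_batch(kernel, x):
    # kernel: (input_dim, output_dim) trainable variable, shared by all samples
    # x:      (batch, input_dim) input, used only to read the dynamic batch size
    batch_size = K.shape(x)[0]
    expanded = K.expand_dims(kernel, axis=0)              # (1, input_dim, output_dim)
    return K.tile(expanded, K.stack([batch_size, 1, 1]))  # (batch, input_dim, output_dim)

If I understand the backend correctly, tiling only duplicates values inside the graph, so the 128 copies should all be views of the same kernel variable, with gradients accumulating into it rather than creating new weights. (And if a single shared matrix is all that's needed, K.dot(x, self.kernel) broadcasts over the batch without any repetition, the way Dense does it.)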

Source: https://stackoverflow.com/questions/50145277/creating-a-keras-layer-of-trainable-weights
