Question
Edit: see the bottom for how I fixed this.
I've written my own Keras layer; its definition is as follows:
from keras import backend as K
from keras.layers import Layer, Dot
from keras.initializers import RandomNormal

class Multed_Weights(Layer):
    def __init__(self, input_dim, output_dim, **kwargs):
        self.output_dim = output_dim
        self.input_dim = input_dim
        super(Multed_Weights, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(self.input_dim, self.output_dim),
                                      initializer=RandomNormal(mean=0., stddev=0.05, seed=None),
                                      trainable=True)
        super(Multed_Weights, self).build(input_shape)  # Be sure to call this somewhere!
        print("mult kernel has shape " + str(K.int_shape(self.kernel)))

    def call(self, x, **kwargs):
        return Dot(axes=[1, 0])([x, self.kernel])

    def compute_output_shape(self, input_shape):
        output_shape = (None, self.output_dim)
        print("the output shape of multed weights is " + str(output_shape))
        return output_shape
Here's what I'm trying to do: following https://arxiv.org/pdf/1503.08895.pdf (see just above citation (3) on page 2), I fixed my previous problem, but now I have another:
"InvalidArgumentError (see above for traceback): Incompatible shapes: [150,128] vs. [150,10000] [[Node: training/SGD/gradients/multed__weights_1/dot_2/Mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@multed__weights_1/dot_2/Mul"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](training/SGD/gradients/multed__weights_1/dot_2/Mul_grad/Shape, training/SGD/gradients/multed__weights_1/dot_2/Mul_grad/Shape_1)]]"
So it's still this class that's causing the problem. My batch size is 128, so it seems the error comes from this weight matrix not having a batch dimension. But add_weight won't let me create it with shape=(None, ... , ...), so I don't know what to do.
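For what it's worth, a plain matrix product already broadcasts one shared weight matrix over every sample in a batch, so the kernel itself normally needs no batch axis. A minimal NumPy sketch of the intended shapes (the sizes here are small stand-ins, not the question's actual 128/10000 dimensions):

```python
import numpy as np

# Hypothetical stand-in sizes for demonstration only.
batch, input_dim, output_dim = 4, 3, 5

x = np.random.randn(batch, input_dim)            # layer input: (batch, input_dim)
kernel = np.random.randn(input_dim, output_dim)  # weights: no batch axis needed

# An ordinary matrix product maps every sample in the batch through the
# same kernel, producing one row of outputs per sample.
y = x @ kernel
print(y.shape)  # (4, 5)
```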
Update: I was too fixated on having a dynamic batch size in this layer. Once I hard-coded the shape to (128, self.input_dim, self.output_dim), knowing ahead of time that my batch size is 128, it worked fine. Although, I realized this method is going to have different weights for each of the 128 samples in the batch, so maybe I should set the shape as before and then RepeatVector it 128 times. I suspect each of the 128 copies will then contain references to the same weights, rather than making new ones, but I'm not sure.
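The suspicion about RepeatVector-style tiling can be checked with a NumPy analogy (this is a sketch of the broadcasting behaviour, not the Keras implementation itself): tiling one kernel along a new batch axis yields views of the same values, so a batched product against the tiled kernel equals the plain shared-kernel product.

```python
import numpy as np

# Hypothetical stand-in sizes for demonstration only.
batch, input_dim, output_dim = 8, 3, 5
x = np.random.randn(batch, input_dim)
kernel = np.random.randn(input_dim, output_dim)   # one shared weight matrix

# Repeating the kernel along a new batch axis (what RepeatVector-style
# tiling would do) exposes the same values to every sample, not
# independent per-sample weights ...
tiled = np.broadcast_to(kernel, (batch, input_dim, output_dim))

# ... so a per-sample (batched) matmul against the tiled kernel gives
# exactly the same result as the plain shared-kernel product.
y_tiled = np.einsum('bi,bio->bo', x, tiled)
y_plain = x @ kernel
print(np.allclose(y_tiled, y_plain))  # True
```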
Source: https://stackoverflow.com/questions/50145277/creating-a-keras-layer-of-trainable-weights