Question
I'm trying to write a custom Keras layer with trainable weights in R which:
- takes an input X and returns the value exp(A * X * A), where A is diagonal and trainable,
and where exp is the matrix exponential map.
Answer 1:
First, it's very important to understand where your batch size is: a layer CANNOT have weights whose sizes depend on the batch size (unless you define your inputs with batch_shape or batch_input_shape instead of shape, which forces the model to use a fixed batch size). Since the batch dimension holds "individual" and "independent" samples, it's not healthy to use the batch size in operations or to mix samples.
That said, I am assuming that X here has shape (batch, dim, dim), and consequently that A will have shape (dim, dim).
For this, you build a custom layer as described here: https://tensorflow.rstudio.com/guide/keras/custom_layers/
There, build will create the kernel (A) with shape (1, dim, 1):
build = function(input_shape) {
  self$kernel <- self$add_weight(
    name = 'kernel',
    shape = list(1, input_shape[[2]], 1),      # (1, dim, 1)
    initializer = initializer_random_normal(), # you may choose a different initializer
    trainable = TRUE
  )
},
And call will use a mathematical trick to simulate the diagonal.
Notice that if A is diagonal, the result of A x X x A will be B * X (elementwise), where B is:
# supposing A has the elements [a, b, c, ...] on its diagonal, B is:
[ [aa, ab, ac, ...],
  [ab, bb, bc, ...],
  [ac, bc, cc, ...],
  ... ]
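This identity can be checked numerically in base R (a minimal sketch; `a` and `X` are arbitrary example values, not part of the answer):

```r
set.seed(1)
dim <- 3
a <- rnorm(dim)                    # the diagonal entries of A
X <- matrix(rnorm(dim * dim), dim, dim)

A <- diag(a)                       # A as a full diagonal matrix
B <- outer(a, a)                   # B[i, j] = a[i] * a[j]

# diag(a) %*% X %*% diag(a) equals the elementwise product B * X
left  <- A %*% X %*% A
right <- B * X
all.equal(left, right)             # TRUE
```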
Because of this, we will not use diagonals, but a broadcasting trick with elementwise multiplication:
call = function(x, mask = NULL) {
  kernelTransposed <- tf$reshape(self$kernel, shape(1L, 1L, -1L)) # (1, 1, dim)
  B <- self$kernel * kernelTransposed                             # broadcast to (1, dim, dim)
  tf$math$exp(x * B)
},
The output shape is unchanged:
compute_output_shape = function(input_shape) {
  input_shape
}
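Putting it together, the whole forward pass (elementwise exp of A x X x A via the broadcasting trick) can be simulated for a batch in base R (a sketch with made-up values; the actual layer does the same thing with TensorFlow ops):

```r
set.seed(2)
batch <- 4; dim <- 3
a <- rnorm(dim)                                  # the trainable diagonal of A
X <- array(rnorm(batch * dim * dim), c(batch, dim, dim))

B <- outer(a, a)                                 # (dim, dim), plays the role of kernel * kernelTransposed

# layer output: elementwise exp of B * X, one (dim, dim) slice per sample
out <- array(0, c(batch, dim, dim))
for (i in seq_len(batch)) out[i, , ] <- exp(B * X[i, , ])

# cross-check against the explicit diag(a) %*% X %*% diag(a) form
A <- diag(a)
ref <- array(0, c(batch, dim, dim))
for (i in seq_len(batch)) ref[i, , ] <- exp(A %*% X[i, , ] %*% A)
all.equal(out, ref)                              # TRUE
```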
Source: https://stackoverflow.com/questions/60281789/r-custom-keras-layer-with-weight-constraints