What is an Embedding in Keras?

日久生厌 2020-12-22 23:09

Keras documentation isn't clear what this actually is. I understand we can use this to compress the input feature space into a smaller one. But how is this done from a neural network perspective?

5 Answers
  •  青春惊慌失措
    2020-12-23 00:11

    The Keras Embedding layer does not perform any matrix multiplication; it only does two things (see the sketch right after this list):

    1. creates a weight matrix of (vocabulary_size)x(embedding_dimension) dimensions

    2. indexes this weight matrix
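
    To see both steps at once, here is a minimal sketch (assuming tf.keras under TensorFlow 2.x with eager execution; the sizes and indexes are made up for illustration):

    import numpy as np
    from tensorflow.keras.layers import Embedding

    layer = Embedding(input_dim=10, output_dim=4)   # step 1: a 10x4 weight matrix
    out = layer(np.array([[3, 7]]))                 # step 2: index it with word indexes 3 and 7

    W = layer.get_weights()[0]                      # the (vocabulary_size)x(embedding_dimension) matrix
    assert np.allclose(out.numpy()[0, 0], W[3])     # the output is literally row 3 of W
    assert np.allclose(out.numpy()[0, 1], W[7])     # ... and row 7: no multiplication involved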


    It is always useful to have a look at the source code to understand what a class does. In this case, we will have a look at the class Embedding, which inherits from the base class Layer.

    (1) - Creating a weight matrix of (vocabulary_size)x(embedding_dimension) dimensions:

    This happens in the build function of Embedding:

    def build(self, input_shape):
        self.embeddings = self.add_weight(
            shape=(self.input_dim, self.output_dim),
            initializer=self.embeddings_initializer,
            name='embeddings',
            regularizer=self.embeddings_regularizer,
            constraint=self.embeddings_constraint,
            dtype=self.dtype)
        self.built = True
    

    If you have a look at the base class Layer, you will see that the add_weight function called above simply creates a matrix of trainable weights (in this case of (vocabulary_size)x(embedding_dimension) dimensions):

    def add_weight(self,
                   name,
                   shape,
                   dtype=None,
                   initializer=None,
                   regularizer=None,
                   trainable=True,
                   constraint=None):
        """Adds a weight variable to the layer.
        # Arguments
            name: String, the name for the weight variable.
            shape: The shape tuple of the weight.
            dtype: The dtype of the weight.
            initializer: An Initializer instance (callable).
            regularizer: An optional Regularizer instance.
            trainable: A boolean, whether the weight should
                be trained via backprop or not (assuming
                that the layer itself is also trainable).
            constraint: An optional Constraint instance.
        # Returns
            The created weight variable.
        """
        initializer = initializers.get(initializer)
        if dtype is None:
            dtype = K.floatx()
        weight = K.variable(initializer(shape),
                            dtype=dtype,
                            name=name,
                            constraint=constraint)
        if regularizer is not None:
            with K.name_scope('weight_regularizer'):
                self.add_loss(regularizer(weight))
        if trainable:
            self._trainable_weights.append(weight)
        else:
            self._non_trainable_weights.append(weight)
        return weight
    
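
    As a quick check (a sketch assuming tf.keras under TensorFlow 2.x; the sizes are made up), building the layer materializes exactly this matrix, and because add_weight defaults to trainable=True it is updated by backprop like any other weight:

    from tensorflow.keras.layers import Embedding

    layer = Embedding(input_dim=1000, output_dim=64)  # vocabulary_size=1000, embedding_dimension=64
    layer.build((None,))                              # runs the build() shown earlier
    print(layer.embeddings.shape)                     # -> (1000, 64)
    print(layer.trainable_weights[0].shape)           # -> (1000, 64): the same matrix, trainable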

    (2) - Indexing this weight matrix

    This happens in the call function of Embedding:

    def call(self, inputs):
        if K.dtype(inputs) != 'int32':
            inputs = K.cast(inputs, 'int32')
        out = K.gather(self.embeddings, inputs)
        return out
    

    This function returns the output of the Embedding layer, which is K.gather(self.embeddings, inputs). What tf.keras.backend.gather does is index the weight matrix self.embeddings (see the build function above) according to inputs, which should be lists of positive integers.
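
    Here is what that gather amounts to (a sketch using tf.gather directly, which is what the TensorFlow backend's K.gather delegates to; the numbers are made up):

    import numpy as np
    import tensorflow as tf

    embeddings = tf.constant(np.arange(15.0).reshape(5, 3))  # a toy 5x3 weight matrix
    inputs = tf.constant([[0, 2], [4, 1]])                   # a batch of two sequences of word indexes

    out = tf.gather(embeddings, inputs)   # picks rows 0,2 and 4,1 -- pure indexing, no arithmetic
    print(out.shape)                      # -> (2, 2, 3): each index is replaced by its row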

    These lists can be obtained, for example, by passing your text/words to the one_hot function of Keras, which encodes a text into a list of word indexes of size n (note that this is NOT one-hot encoding - see also this example for more info: https://machinelearningmastery.com/use-word-embedding-layers-deep-learning-keras/).
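
    For example (a sketch; one_hot hashes words to indexes in [1, n), so the exact values vary and collisions are possible):

    from tensorflow.keras.preprocessing.text import one_hot

    print(one_hot('the cat sat on the mat', n=50))
    # e.g. [23, 7, 41, 12, 23, 35] -- word indexes, with the repeated word 'the' getting the same index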


    And that's all there is to it: there is no matrix multiplication.

    On the contrary, the Keras Embedding layer is useful precisely because it avoids performing that matrix multiplication, and hence it economizes on computational resources.

    Otherwise, you could just use a Keras Dense layer (after one-hot encoding your input data) to get a matrix of trainable weights (of (vocabulary_size)x(embedding_dimension) dimensions) and then do the multiplication; the output would be exactly the same as the output of the Embedding layer (see the sketch below).
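
    A small numpy sketch of that equivalence (the weights and index are made up): multiplying a one-hot row vector by the weight matrix merely selects one of its rows, which is exactly what the Embedding lookup returns directly:

    import numpy as np

    vocabulary_size, embedding_dimension = 5, 3
    W = np.arange(15.0).reshape(vocabulary_size, embedding_dimension)  # the trainable weights

    word_index = 2
    one_hot_vec = np.zeros(vocabulary_size)
    one_hot_vec[word_index] = 1.0

    via_matmul = one_hot_vec @ W    # what a linear Dense layer (no bias) would compute
    via_lookup = W[word_index]      # what the Embedding layer computes

    assert np.allclose(via_matmul, via_lookup)

    The lookup touches a single row of W, while the multiplication touches every entry, which is where the computational saving comes from.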
