What is the difference between an Embedding Layer and a Dense Layer?

广开言路 2020-12-05 07:09

The docs for an Embedding Layer in Keras say:

Turns positive integers (indexes) into dense vectors of fixed size. e.g. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]

2 answers
  •  爱一瞬间的悲伤
    2020-12-05 07:30

    Mathematically, the difference is this (a rough NumPy sketch of both operations follows the list):

    • An embedding layer performs a select (lookup) operation. In Keras, this layer is equivalent to:

      K.gather(self.embeddings, inputs)      # just one matrix
      
    • A dense layer performs a dot-product operation, plus an optional bias and activation:

      outputs = matmul(inputs, self.kernel)  # a kernel matrix
      outputs = bias_add(outputs, self.bias) # a bias vector
      return self.activation(outputs)        # an activation function
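
    As a rough illustration in plain NumPy (the matrix sizes below are made up for the example), the two operations look like this:

      import numpy as np

      W = np.random.randn(5, 3)   # weight matrix: 5 rows (vocab size), 3 columns (output dim)
      b = np.random.randn(3)      # bias vector (used by the dense layer only)

      # Embedding: a pure row lookup (gather); no arithmetic is performed
      indices = np.array([4, 2])            # integer token indices
      embedded = W[indices]                 # shape (2, 3)

      # Dense: matrix product, plus bias, plus activation
      x = np.random.randn(2, 5)             # real-valued inputs, not indices
      dense_out = np.tanh(x @ W + b)        # shape (2, 3)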
      

    You can emulate an embedding layer with a fully-connected layer via one-hot encoding, but the whole point of dense embeddings is to avoid one-hot representations. In NLP, the vocabulary size can be on the order of 100k words (sometimes even a million). On top of that, sequences of words often need to be processed in batches, and processing a batch of sequences of word indices is much more efficient than processing a batch of sequences of one-hot vectors. In addition, the gather operation itself is faster than a matrix dot-product, in both the forward and backward pass.
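
    A quick way to convince yourself of this equivalence is the following sketch (the layer sizes are hypothetical, and it assumes TensorFlow 2.x with eager execution):

      import numpy as np
      import tensorflow as tf

      vocab_size, dim = 10, 4                      # made-up sizes for illustration

      emb = tf.keras.layers.Embedding(vocab_size, dim)
      dense = tf.keras.layers.Dense(dim, use_bias=False)

      ids = np.array([[1, 7, 3]])                  # one batch of word indices
      one_hot = tf.one_hot(ids, depth=vocab_size)  # shape (1, 3, 10)

      out_emb = emb(ids)                           # row lookup; shape (1, 3, 4)
      _ = dense(one_hot)                           # call once so the kernel is built
      dense.set_weights(emb.get_weights())         # share the same weight matrix
      out_dense = dense(one_hot)                   # matmul with one-hot rows

      print(np.allclose(out_emb.numpy(), out_dense.numpy()))  # True

    The one-hot version produces the same result through a full matrix multiplication over a mostly-zero input, which is exactly the overhead the embedding layer's lookup avoids.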
