Keras- Embedding layer

后端未结

关注

 3  1094

梦如初夏 2021-02-06 05:06

What does input_dim, output_dim and input_length mean in:

Embedding(input_dim, output_dim, input_length)

3条回答

甜味超标 (楼主)

2021-02-06 05:28

It order to use words for natural language processing or machine learning tasks, it is necessary to first map them onto a continuous vector space, thus creating word vectors or word embeddings. The Keras Embedding layer is useful for constructing such word vectors.

input_dim : the vocabulary size. This is how many unique words are represented in your corpus.

output_dim : the desired dimension of the word vector. For example, if output_dim = 100, then every word will be mapped onto a vector with 100 elements, whereas if output_dim = 300, then every word will be mapped onto a vector with 300 elements.

input_length : the length of your sequences. For example, if your data consists of sentences, then this variable represents how many words there are in a sentence. As disparate sentences typically contain different number of words, it is usually required to pad your sequences such that all sentences are of equal length. The keras.preprocessing.pad_sequence method can be used for this (https://keras.io/preprocessing/sequence/).

In Keras, it is possible to either 1) use pretrained word vectors such as GloVe or word2vec representations, or 2) learn the word vectors as part of the training process. This blog post (https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html) offers a tutorial on how to use GloVe pretrained word vectors. For option 2, Keras will randomly initialize vectors as the default option, and then learn optimal word vectors during the training process.

0 讨论(0)

查看其它3个回答
发布评论:

提交评论
- 加载中...