Embedding in PyTorch

清歌不尽 2021-01-30 08:38

I have checked the PyTorch tutorial and questions similar to this one on Stack Overflow.

I get confused; does the embedding in PyTorch (Embedding) make similar words closer to each other?

4 answers
  •  梦谈多话
    2021-01-30 09:41

    nn.Embedding holds a Tensor of dimension (vocab_size, vector_size), i.e. the size of the vocabulary times the dimension of each embedding vector, and a method that does the lookup.
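
    You can check this directly; the sketch below assumes a toy vocabulary of 10 words with 4-dimensional vectors, and shows that the forward pass is just a row lookup into that weight Tensor:

    import torch
    from torch import nn

    embedding = nn.Embedding(10, 4)  # toy example: 10 words, 4-dimensional vectors
    print(embedding.weight.shape)    # torch.Size([10, 4]), i.e. (vocab_size, vector_size)
    idx = torch.tensor([2])
    assert torch.equal(embedding(idx)[0], embedding.weight[2])  # forward is a row lookup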

    When you create an embedding layer, the Tensor is initialised randomly. It is only when you train it that this similarity between similar words should appear, unless you have overwritten the values of the embedding with a previously trained model such as GloVe or Word2Vec, but that's another story.
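
    A minimal sketch of that overwriting step, assuming pretrained holds vectors loaded from GloVe or Word2Vec elsewhere (here a random stand-in of the right shape):

    import torch
    from torch import nn

    pretrained = torch.randn(1000, 128)  # stand-in for a real (vocab_size, vector_size) matrix
    embedding = nn.Embedding.from_pretrained(pretrained, freeze=True)  # freeze=True keeps the vectors fixed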

    So, once you have the embedding layer defined and the vocabulary defined and encoded (i.e. a unique number assigned to each word in the vocabulary), you can use the instance of the nn.Embedding class to get the corresponding embeddings.

    For example:

    import torch
    from torch import nn

    embedding = nn.Embedding(1000, 128)   # vocabulary of 1000 words, 128-dimensional vectors
    embedding(torch.LongTensor([3, 4]))   # look up the vectors for words 3 and 4

    will return the embedding vectors corresponding to words 3 and 4 in your vocabulary. As no model has been trained, they will be random.
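
    A hypothetical sketch of the encoding step mentioned above: assign each word a unique index, then look sentences up through the layer (the vocab dict below is made up for illustration):

    import torch
    from torch import nn

    vocab = {"<pad>": 0, "hello": 1, "world": 2}  # hypothetical word-to-index mapping
    embedding = nn.Embedding(len(vocab), 128)
    indices = torch.LongTensor([vocab[w] for w in ["hello", "world"]])
    vectors = embedding(indices)                  # shape (2, 128); random until trained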
