Why are word embeddings actually vectors?

臣服心动  2020-12-09 23:27

I am sorry for my naivety, but I don't understand why the word embeddings that result from an NN training process (word2vec) are actually vectors.

Embedding is the process of turning words into arrays of numbers, and the process does nothing that applies vector arithmetic. So why should I think of the resulting arrays as vectors?

4 Answers
  •  没有蜡笔的小新  2020-12-10 00:33

    "the process does nothing that applies vector arithmetic"

    The training process has nothing to do with vector arithmetic, but once the arrays are produced, it turns out they have pretty nice properties, so that one can think of a "word linear space".

    For example, what words have embeddings closest to a given word in this space?
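
    A minimal sketch of how you might answer that question, using plain NumPy and a tiny made-up embedding table (the words, dimensions and values below are purely illustrative, not real word2vec output):

        import numpy as np

        # Toy embedding table: word -> 4-dimensional vector.
        # The values are made up for illustration; real word2vec vectors
        # typically have a few hundred dimensions.
        embeddings = {
            "king":  np.array([0.80, 0.60, 0.10, 0.05]),
            "queen": np.array([0.75, 0.60, 0.80, 0.10]),
            "man":   np.array([0.90, 0.10, 0.05, 0.10]),
            "woman": np.array([0.85, 0.10, 0.75, 0.15]),
            "apple": np.array([0.05, 0.90, 0.10, 0.80]),
        }

        def cosine(u, v):
            # Cosine similarity: 1.0 means "pointing the same way", 0.0 means orthogonal.
            return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

        def nearest(word, k=3):
            # Rank all other words by cosine similarity to `word`.
            query = embeddings[word]
            scores = {w: cosine(query, v) for w, v in embeddings.items() if w != word}
            return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

        print(nearest("king"))  # with this toy table, "man" and "queen" come out on top
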

    To put it differently, words with similar meanings form clusters; a 2-D t-SNE projection of the embeddings (see the post referenced at the end) makes these clusters visible.

    As another example, the vector from "man" to "woman" is very close to the vector from "uncle" to "aunt".

    As a result, you get pretty reasonable arithmetic:

    W("woman") − W("man") ≃ W("aunt") − W("uncle")
    W("woman") − W("man") ≃ W("queen") − W("king")
    
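    In code, that analogy arithmetic is just element-wise subtraction and addition followed by a nearest-neighbour lookup. Here is a rough sketch, reusing the toy `embeddings` table and `cosine` helper from the snippet above (so the result only means anything for that made-up data):

        def analogy(a, b, c, k=1):
            # Solve "a is to b as c is to ?" via  W(b) - W(a) + W(c),
            # then return the word(s) closest to that target vector.
            target = embeddings[b] - embeddings[a] + embeddings[c]
            candidates = {w: cosine(target, v)
                          for w, v in embeddings.items() if w not in (a, b, c)}
            return sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)[:k]

        # "man is to woman as king is to ?"  ->  hopefully "queen"
        print(analogy("man", "woman", "king"))

    Real embedding libraries expose the same idea directly; gensim's KeyedVectors, for instance, has a most_similar method that takes positive and negative word lists.
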

    So it's not far-fetched to call them vectors. All pictures are from this wonderful post, which I very much recommend reading.
