Why are word embeddings actually vectors?

臣服心动  2020-12-09 23:27

I am sorry for my naivety, but I don't understand why the word embeddings that result from an NN training process (word2vec) are actually vectors.

Embedding is the process of turning words into arrays of numbers, and the process does nothing that applies vector arithmetic. So why should I think of the resulting arrays as vectors?

4 Answers
  •  没有蜡笔的小新  2020-12-10 00:33

    "the process does nothing that applies vector arithmetic"

    The training process has nothing to do with vector arithmetic, but once the arrays are produced, it turns out they have pretty nice properties, so that one can think of a "word linear space".

    For example, what words have embeddings closest to a given word in this space?
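
    A minimal sketch of how you might answer that question, using plain NumPy and a tiny made-up embedding table (the words, dimensions and values below are purely illustrative, not real word2vec output):

        import numpy as np

        # Toy embedding table: word -> 4-dimensional vector.
        # The values are made up for illustration; real word2vec vectors
        # typically have a few hundred dimensions.
        embeddings = {
            "king":  np.array([0.80, 0.60, 0.10, 0.05]),
            "queen": np.array([0.75, 0.60, 0.80, 0.10]),
            "man":   np.array([0.90, 0.10, 0.05, 0.10]),
            "woman": np.array([0.85, 0.10, 0.75, 0.15]),
            "apple": np.array([0.05, 0.90, 0.10, 0.80]),
        }

        def cosine(u, v):
            # Cosine similarity: 1.0 means "pointing the same way", 0.0 means orthogonal.
            return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

        def nearest(word, k=3):
            # Rank all other words by cosine similarity to `word`.
            query = embeddings[word]
            scores = {w: cosine(query, v) for w, v in embeddings.items() if w != word}
            return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

        print(nearest("king"))  # with this toy table, "man" and "queen" come out on top
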

    To put it differently, words with similar meanings form clusters; a 2-D t-SNE projection of the embeddings (see the post referenced at the end) makes these clusters visible.

    As another example, the vector from "man" to "woman" is very close to the vector from "uncle" to "aunt".

    As a result, you get pretty reasonable arithmetic:

    W("woman") − W("man") ≃ W("aunt") − W("uncle")
    W("woman") − W("man") ≃ W("queen") − W("king")
    
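    In code, that analogy arithmetic is just element-wise subtraction and addition followed by a nearest-neighbour lookup. Here is a rough sketch, reusing the toy `embeddings` table and `cosine` helper from the snippet above (so the result only means anything for that made-up data):

        def analogy(a, b, c, k=1):
            # Solve "a is to b as c is to ?" via  W(b) - W(a) + W(c),
            # then return the word(s) closest to that target vector.
            target = embeddings[b] - embeddings[a] + embeddings[c]
            candidates = {w: cosine(target, v)
                          for w, v in embeddings.items() if w not in (a, b, c)}
            return sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)[:k]

        # "man is to woman as king is to ?"  ->  hopefully "queen"
        print(analogy("man", "woman", "king"))

    Real embedding libraries expose the same idea directly; gensim's KeyedVectors, for instance, has a most_similar method that takes positive and negative word lists.
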

    So it's not far-fetched to call them vectors. All pictures are from this wonderful post, which I very much recommend reading.
