What does the vector of a word in word2vec represent?

粉色の甜心 2021-01-30 07:49

word2vec is an open source tool by Google:

  • For each word it provides a vector of float values. What exactly do these values represent?

  • There is also a paper on paragraph vectors. How does one obtain a fixed-length vector for a sentence or paragraph?

2 Answers
  •  耶瑟儿~
    2021-01-30 08:15

    Fixed-width contexts for each word are used as input to a neural network. The output of the network is a vector of float values - aka the word embedding - of a given dimension (typically 50 or 100). The network is trained so as to produce good word embeddings for the training/test corpus.
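
    A minimal sketch of this in Python, using the gensim library (the toy corpus, the vector_size=100 setting, and gensim itself are assumptions for illustration; the original word2vec is a standalone C tool):

        # Train a small Word2Vec model and inspect a word vector (gensim 4.x parameter names).
        from gensim.models import Word2Vec

        # Toy corpus: a list of tokenized sentences (illustration only).
        corpus = [
            ["the", "cat", "sat", "on", "the", "mat"],
            ["the", "dog", "chased", "the", "cat"],
        ]

        # vector_size is the embedding dimension; window is the context width.
        model = Word2Vec(corpus, vector_size=100, window=5, min_count=1)

        # Each word now maps to a dense vector of 100 floats.
        vec = model.wv["cat"]
        print(vec.shape)  # (100,)

        # The individual values are not directly interpretable; what carries
        # meaning is how the vectors relate to each other.
        print(model.wv.similarity("cat", "dog"))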

    One can easily come up with a fixed-size input for any word - say, the M words to its left and the N words to its right. How to do so for a sentence or paragraph, whose sizes vary, is not as apparent, or at least it wasn't at first. Without having read the paper, I'm guessing one can combine the fixed-width embeddings of all the words in the sentence/paragraph to come up with a fixed-length embedding for the sentence/paragraph.
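
    As a rough illustration of that guess, averaging the word vectors is one simple way to get a fixed-length vector out of a variable-length sentence (a hypothetical sketch, not the method of the paragraph-vector paper):

        import numpy as np

        def average_sentence_vector(model, tokens):
            # Average the vectors of the tokens the model knows; the result has
            # the same fixed length regardless of how many tokens were given.
            vectors = [model.wv[t] for t in tokens if t in model.wv]
            if not vectors:
                return np.zeros(model.vector_size)
            return np.mean(vectors, axis=0)

        # Usage with the model trained above:
        sentence_vec = average_sentence_vector(model, ["the", "dog", "sat"])
        print(sentence_vec.shape)  # (100,)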
