How to concatenate word vectors to form sentence vector

╄→гoц情女王★ submitted on 2019-12-04 05:03:01

Question


I have read in some papers (Tomas Mikolov...) that a better way of forming the vector for a sentence is to concatenate its word vectors.

But since I am weak at mathematics, I am still not sure about the details.

For example, suppose that the dimension of each word vector is m, and that a sentence has n words.

What is the correct result of the concatenation operation?

Is it a row vector of size 1 x m*n, or a matrix of size m x n?

Please advise.

Thanks


Answer 1:


There are at least three common ways to combine embedding vectors: (a) summing, (b) averaging (summing, then dividing by the number of vectors), or (c) concatenating. In your case, concatenating gives a 1 x m*n vector, where n is the number of words in the sentence. In the other two cases, the vector length stays m. See gensim.models.doc2vec.Doc2Vec and its dm_concat and dm_mean parameters, which let you choose among these three options [1,2].

[1] http://radimrehurek.com/gensim/models/doc2vec.html#gensim.models.doc2vec.LabeledLineSentence

[2] https://github.com/piskvorky/gensim/blob/develop/gensim/models/doc2vec.py
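
As a rough illustration of the three options, here is a minimal NumPy sketch. The values m = 4 and n = 3 and the random word vectors are assumptions made up for the example; they stand in for real embeddings and do not come from gensim.

```python
import numpy as np

# Hypothetical sizes: embedding dimension m, sentence length n.
m, n = 4, 3

# Random stand-ins for the n word vectors, one row per word.
rng = np.random.default_rng(0)
word_vectors = rng.normal(size=(n, m))

# (a) Summing: element-wise sum over the words, length stays m.
summed = word_vectors.sum(axis=0)

# (b) Averaging: the sum divided by n, length stays m.
averaged = word_vectors.mean(axis=0)

# (c) Concatenating: the rows are laid end to end, giving a 1 x m*n row vector.
concatenated = word_vectors.reshape(1, -1)

print(summed.shape, averaged.shape, concatenated.shape)
# prints: (4,) (4,) (1, 12)
```

Note that the concatenated length depends on n, so sentences with different numbers of words yield vectors of different lengths, whereas the sum and the average always have length m.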



Source: https://stackoverflow.com/questions/36731784/how-to-concatenate-word-vectors-to-form-sentence-vector
