Word2Vec: Effect of window size used

夕颜 2020-12-24 15:06

I am trying to train a word2vec model on very short phrases (5-grams). Since each sentence or example is very short, I believe the window size I can use can be at most 2. I a

2 answers
  •  暗喜 (original poster)
    2020-12-24 15:36

    To your question: "I am trying to understand what the implications of such a small window size are on the quality of the learned model".

    For example, take "stackoverflow great website for programmers", which has 5 words (suppose we keep the stop words "great" and "for" here). If the window size is 2, the vector of the word "stackoverflow" is directly affected by the words "great" and "website". If the window size is 5, "stackoverflow" can also be directly affected by two more words, "for" and "programmers". "Affected" here means that training pulls the vectors of the two words closer together.
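
To make the example concrete, here is a minimal sketch that lists which words fall inside the context window of "stackoverflow" for window sizes 2 and 5. The helper `context_pairs` is a hypothetical name used only for illustration, not part of any word2vec library, and it assumes a fixed symmetric window. Real implementations may additionally shrink the window at random for each target word, which this sketch ignores.

```python
def context_pairs(tokens, window):
    """Return (center, context) pairs for a fixed symmetric window.

    Hypothetical helper for illustration; it only enumerates which
    word pairs a window of the given size would expose to training.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                # left edge, clipped at sentence start
        hi = min(len(tokens), i + window + 1)  # right edge, clipped at sentence end
        for j in range(lo, hi):
            if j != i:                         # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


tokens = "stackoverflow great website for programmers".split()

for window in (2, 5):
    neighbors = [ctx for center, ctx in context_pairs(tokens, window)
                 if center == "stackoverflow"]
    print(window, neighbors)
# 2 ['great', 'website']
# 5 ['great', 'website', 'for', 'programmers']
```

Because the phrase has only 5 tokens, two words are never more than 4 positions apart, so in this sketch any window of 4 or more yields exactly the same pairs as a window of 5.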

    So it depends on the material you are using for training: if a window size of 2 is enough to capture the context of a word but 5 is chosen, the quality of the learned model will decrease, and vice versa.
