Training wordvec in Tensorflow, importing to Gensim

Submitted by 生来就可爱ヽ(ⅴ<●) on 2020-01-02 09:24:24

Question


I am training a word2vec model using the script from the TensorFlow tutorial:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/word2vec/word2vec_basic.py

After training I get the embedding matrix. I would like to save it and import it into gensim as a trained model.

To load a model in gensim, the command is:

model = Word2Vec.load_word2vec_format(fn, binary=True)

But how do I generate the file fn from TensorFlow?

Thanks


Answer 1:


One way is to save the file in the non-binary (plain-text) Word2Vec format, which essentially looks like this:

num_words vector_size  # this is the header
label0 x00 x01 ... x0N
label1 x10 x11 ... x1N
...

Example:

2 3
word0 -0.000737 -0.002106 0.001851
word1 -0.000878 -0.002106 0.002834
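A minimal sketch of writing such a file from Python follows. It assumes you already have the vocabulary as a list of words and the embeddings as a 2-D array with one row per word, in the same order. The function name save_word2vec_text, the toy data, and the output path are hypothetical stand-ins and are not part of the tutorial script or of gensim.

```python
import os
import tempfile

import numpy as np


def save_word2vec_text(path, words, embeddings):
    """Write words and vectors in the non-binary word2vec text format."""
    embeddings = np.asarray(embeddings, dtype=float)
    if embeddings.ndim != 2 or len(words) != embeddings.shape[0]:
        raise ValueError("expected a 2-D array with one row per word")
    num_words, vector_size = embeddings.shape
    with open(path, "w", encoding="utf-8") as f:
        # Header line: number of words, then vector dimensionality.
        f.write(f"{num_words} {vector_size}\n")
        for word, vector in zip(words, embeddings):
            # Fields are space-separated, so labels must not contain whitespace.
            values = " ".join(f"{x:.6f}" for x in vector)
            f.write(f"{word} {values}\n")


# Toy data standing in for the trained vocabulary and embedding matrix.
words = ["word0", "word1"]
embeddings = np.array([[-0.000737, -0.002106, 0.001851],
                       [-0.000878, -0.002106, 0.002834]])

# Hypothetical output location; any writable path works.
path = os.path.join(tempfile.gettempdir(), "embeddings.txt")
save_word2vec_text(path, words, embeddings)
```

With the toy data above, the resulting file has the same three lines as the example. For a real model, pass the evaluated embedding matrix and the words listed in the same order as its rows.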

Save the file, then load it with the keyword argument binary=False:

model = Word2Vec.load_word2vec_format(filename, binary=False)

print(model['word0'])

Update

In newer versions of gensim, the model is loaded through KeyedVectors instead:

from gensim.models.keyedvectors import KeyedVectors

model = KeyedVectors.load_word2vec_format(model_path, binary=False)
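If loading fails, it can help to confirm that the file actually matches its header before suspecting gensim. The sketch below is a plain-Python sanity check, not a gensim API. The function name check_word2vec_text and the sample file are hypothetical, and it assumes a space-separated text file as described above.

```python
import os
import tempfile


def check_word2vec_text(path):
    """Return (num_words, vector_size) if every row agrees with the header."""
    with open(path, encoding="utf-8") as f:
        # The header holds two integers: word count and vector size.
        num_words, vector_size = (int(n) for n in f.readline().split())
        rows = 0
        for line in f:
            fields = line.rstrip("\n").split(" ")
            if len(fields) != vector_size + 1:
                raise ValueError(
                    f"row {rows} has {len(fields) - 1} values, "
                    f"expected {vector_size}"
                )
            for x in fields[1:]:
                float(x)  # raises ValueError on a non-numeric entry
            rows += 1
    if rows != num_words:
        raise ValueError(f"header says {num_words} words, found {rows}")
    return num_words, vector_size


# Hypothetical sample file mirroring the example in the answer.
sample_path = os.path.join(tempfile.gettempdir(), "w2v_sample.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("2 3\n")
    f.write("word0 -0.000737 -0.002106 0.001851\n")
    f.write("word1 -0.000878 -0.002106 0.002834\n")

shape = check_word2vec_text(sample_path)
print(shape)
```

A mismatch between the header and the rows, or a label containing a space, is reported here as a ValueError naming the offending row.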


Source: https://stackoverflow.com/questions/42186543/training-wordvec-in-tensorflow-importing-to-gensim
