Question
I am training a word2vec model from the TensorFlow tutorial:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/word2vec/word2vec_basic.py
After training I get the embedding matrix. I would like to save this and import it as a trained model in gensim.
To load a model in gensim, the command is:
model = Word2Vec.load_word2vec_format(fn, binary=True)
But how do I generate the fn file from TensorFlow?
Thanks
Answer 1:
One way is to save the file in the non-binary Word2Vec format, which essentially looks like this:
num_words vector_size # this is the header
label0 x00 x01 ... x0N
label1 x10 x11 ... x1N
...
Example:
2 3
word0 -0.000737 -0.002106 0.001851
word1 -0.000878 -0.002106 0.002834
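A minimal sketch of writing that format from Python is shown below. The helper name save_word2vec_text and the toy data are made up for illustration; the function only assumes a list of words and a matching 2-D sequence of floats (a NumPy array should work too, since it iterates row by row), and that no word contains whitespace.

```python
import os
import tempfile


def save_word2vec_text(path, words, vectors):
    """Write words and their vectors in the non-binary word2vec format.

    Hypothetical helper for illustration: `words` is a sequence of strings,
    `vectors` is a 2-D sequence of floats with one row per word.
    """
    rows = [list(row) for row in vectors]
    if len(words) != len(rows):
        raise ValueError("need exactly one word per vector")
    vector_size = len(rows[0]) if rows else 0
    with open(path, "w", encoding="utf-8") as f:
        # Header line: number of words, then vector dimensionality.
        f.write("%d %d\n" % (len(rows), vector_size))
        # One line per word: the label followed by its space-separated values.
        for word, row in zip(words, rows):
            values = " ".join("%.6f" % float(x) for x in row)
            f.write("%s %s\n" % (word, values))


# Toy data matching the example above.
words = ["word0", "word1"]
vectors = [
    [-0.000737, -0.002106, 0.001851],
    [-0.000878, -0.002106, 0.002834],
]
out_path = os.path.join(tempfile.mkdtemp(), "vectors.txt")
save_word2vec_text(out_path, words, vectors)
```

With the tutorial script, the inputs would presumably be the trained embedding matrix and the index-to-word mapping it builds (final_embeddings and reverse_dictionary, if those names still match your copy of the script), with the words listed in the same order as the matrix rows.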
Save the file and then load with kwarg binary=False:
model = Word2Vec.load_word2vec_format(filename, binary=False)
print(model['word0'])
Update
In newer versions of gensim, load the model through KeyedVectors instead:
from gensim.models.keyedvectors import KeyedVectors
model = KeyedVectors.load_word2vec_format(model_path, binary=False)
Source: https://stackoverflow.com/questions/42186543/training-wordvec-in-tensorflow-importing-to-gensim