According to the docs, we can use this to read a word2vec model with gensim:
from gensim.models import KeyedVectors
model = KeyedVectors.load_word2vec_format('word2vec.50d.txt', binary=False)
An even simpler solution is to enumerate index2word:
# w2v is the loaded KeyedVectors model
word2index = {token: token_index for token_index, token in enumerate(w2v.index2word)}
word2index['hi'] == 30308  # True
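To see what the enumerate trick does without loading a model, here is a minimal sketch that uses a plain list as a stand-in for w2v.index2word. The token list and the helper name build_word2index are made up for illustration. Note that in gensim 4.x the attribute is called index_to_key, and the model also exposes a ready-made key_to_index dict, so on newer versions you may not need to build the mapping yourself.

```python
# Hypothetical stand-in for w2v.index2word: a list of tokens ordered by index.
index2word = ["the", "of", "hi", "model"]


def build_word2index(tokens):
    """Map each token to its position in the list (illustrative helper)."""
    return {token: token_index for token_index, token in enumerate(tokens)}


word2index = build_word2index(index2word)

# Round trip: looking a word up and indexing back gives the same word.
idx = word2index["hi"]
print(idx, index2word[idx])
```

With a real model you would pass w2v.index2word (or w2v.index_to_key on gensim 4.x) instead of the toy list.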