Given a model, e.g.

    from gensim.models.word2vec import Word2Vec

    documents = ["Human machine interface for lab abc computer applications",
                 "A survey of user opinion of computer system response time",
                 ...]
There is no direct way to do what you are looking for, but you are not completely lost, either. The method most_similar is implemented in the class WordEmbeddingsKeyedVectors in gensim; you can take a look at its source and modify it to suit your needs.
The lines shown below perform the actual logic of finding the most similar words. You need to replace the variable limited with vectors corresponding to the words you are interested in; then you are done:
    limited = self.vectors_norm if restrict_vocab is None else self.vectors_norm[:restrict_vocab]
    dists = dot(limited, mean)  # cosine similarity of every row of `limited` to the query mean
    if not topn:
        return dists
    best = matutils.argsort(dists, topn=topn + len(all_words), reverse=True)
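To see what these three steps do in isolation, here is a toy numpy example with made-up unit vectors (not gensim data):

    import numpy as np

    limited = np.array([[1.0, 0.0],
                        [0.6, 0.8],
                        [0.0, 1.0]])  # three unit-length word vectors
    mean = np.array([1.0, 0.0])       # unit-length query vector
    dists = limited.dot(mean)         # cosine similarities: [1.0, 0.6, 0.0]
    best = np.argsort(-dists)         # indices sorted by similarity: [0, 1, 2]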
Update:
    limited = self.vectors_norm if restrict_vocab is None else self.vectors_norm[:restrict_vocab]
As this line shows, when restrict_vocab is passed, it restricts the search to the first n words of the vocabulary, which is meaningful only if the vocabulary is sorted by descending frequency (gensim's default). If you do not pass restrict_vocab, all of self.vectors_norm goes into limited.
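Note that restrict_vocab is a parameter of the public most_similar API, so you can also pass it directly (the word and the cutoff below are arbitrary examples):

    # search only among the 10,000 most frequent vocabulary words
    model.wv.most_similar(positive=["human"], restrict_vocab=10000)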
The method most_similar calls another method, init_sims, which initializes the value of self.vectors_norm as shown below:
    # L2-normalize each word vector so that dot products become cosine similarities
    self.vectors_norm = (self.vectors / sqrt((self.vectors ** 2).sum(-1))[..., newaxis]).astype(REAL)
So, you can pick up the words you are interested in, prepare their normalized vectors in the same way, and use the result in place of limited. This should work.
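Putting it all together, here is a minimal sketch of such a restricted search (it assumes gensim 3.x attribute names like vocab, vectors_norm, and word_vec; most_similar_among is a hypothetical helper, not part of gensim):

    import numpy as np

    def most_similar_among(kv, positive, candidates, topn=10):
        # Make sure kv.vectors_norm (the unit-length vectors) is initialized.
        kv.init_sims()
        # Query vector: mean of the normalized positive vectors, re-normalized,
        # mirroring what most_similar does internally.
        mean = np.mean([kv.word_vec(w, use_norm=True) for w in positive], axis=0)
        mean /= np.sqrt((mean ** 2).sum())
        # Keep only candidate words that are actually in the vocabulary.
        in_vocab = [w for w in candidates if w in kv.vocab]
        # This matrix plays the role of `limited`; its rows are already normalized.
        limited = kv.vectors_norm[[kv.vocab[w].index for w in in_vocab]]
        dists = limited.dot(mean)
        best = np.argsort(-dists)[:topn]
        return [(in_vocab[i], float(dists[i])) for i in best]

For example, most_similar_among(model.wv, positive=["human"], candidates=["computer", "interface", "tree"]) returns only those candidate words, ranked by cosine similarity to "human".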