An LDA model generates different topics every time I train it on the same corpus. By setting np.random.seed(0), the LDA model will always be initialized and tr
Yes, the default random seed is fixed to 1, as described by the author in https://radimrehurek.com/gensim/models/word2vec.html. The vector for each word is initialised using a hash of the concatenation of word + str(seed).
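To illustrate the idea of seeding each word's vector from a hash of word + str(seed), here is a conceptual sketch in plain Python. It is not gensim's actual code: the name seeded_vector, the use of the standard random module, and the scaling are all assumptions made for illustration.

```python
import random

def seeded_vector(word, seed, vector_size, hashfxn=hash):
    """Hypothetical helper: derive a word's initial vector from hashfxn(word + str(seed))."""
    # Mask to 32 bits so the hash is a non-negative value usable as an RNG seed.
    rng = random.Random(hashfxn(word + str(seed)) & 0xFFFFFFFF)
    # Small values centered on zero, scaled down by the vector size.
    return [(rng.random() - 0.5) / vector_size for _ in range(vector_size)]

# Within one interpreter process, the same word and seed give the same vector.
v1 = seeded_vector("apple", 1, 5)
v2 = seeded_vector("apple", 1, 5)
print(v1 == v2)
```

Because the vector depends only on the hash value, any difference in what the hash function returns for the same string leads to a different initial vector.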
The hashing function used, however, is Python’s rudimentary built-in hash function, and it can produce different results if two machines differ in
The above list is not exhaustive. Does it cover your question though?
EDIT
If you want to ensure consistency, you can provide your own hashing function as the hashfxn argument to Word2Vec.
A very simple (and bad) example would be:
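    from gensim.models import Word2Vec

    def first_char_hash(astring):
        return ord(astring[0])  # deterministic, but every word with the same first letter collides

    # gensim 4+ renamed size to vector_size and moved word lookups to model.wv
    model = Word2Vec(sentences, vector_size=10, window=5, min_count=5, workers=4, hashfxn=first_char_hash)
    print(model.wv[sentences[0][0]])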
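A less collision-prone alternative to the first-character example could be built on a checksum from the standard library. The sketch below is only one possible choice: stable_hash is a hypothetical name, and CRC32 is my substitution, not something gensim prescribes.

```python
import zlib

def stable_hash(astring):
    """Hypothetical replacement for hash(): CRC32 of the string's UTF-8 bytes.

    Unlike the built-in hash(), the result does not depend on the
    interpreter's hash randomization or on the platform word size.
    """
    return zlib.crc32(astring.encode("utf-8")) & 0xFFFFFFFF

# The same string always maps to the same 32-bit value.
print(stable_hash("apple1") == stable_hash("apple1"))
```

It would be passed the same way as above, with hashfxn=stable_hash. Note that the gensim documentation also says a fully reproducible run needs workers=1, since thread scheduling affects the order of updates, and that on Python 3 the PYTHONHASHSEED environment variable controls hash randomization between interpreter launches.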