Python: clustering similar words based on word2vec
问题 This might be the naive question which I am about to ask. I have a tokenized corpus on which I have trained Gensim's Word2vec model. The code is as below site = Article("http://www.datasciencecentral.com/profiles/blogs/blockchain-and-artificial-intelligence-1") site.download() site.parse() def clean(doc): stop_free = " ".join([i for i in word_tokenize(doc.lower()) if i not in stop]) punc_free = ''.join(ch for ch in stop_free if ch not in exclude) normalized = " ".join(lemma.lemmatize(word)