Different results of LDA using R(topicmodels)

Posted by ╄→гoц情女王★ on 2019-12-12 02:52:19

Question


I am using the R topicmodels package to train an LDA model on a small corpus, but every time I rerun the same code I get different results (different topics and different topic terms). Why do the same settings and the same corpus produce a different result on every run, and what should I do to stabilize the result? Here is my code:

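library(tm)
library(topicmodels)
# Read every file under ./corpus/train into a corpus
cname <- file.path(".", "corpus", "train")
docs <- Corpus(DirSource(cname))
# Replace a pattern with a space (the original replaced it with "", which glued words together)
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
docs <- tm_map(docs, toSpace, "/")
docs <- tm_map(docs, toSpace, "@")
docs <- tm_map(docs, toSpace, "#")
docs <- tm_map(docs, toSpace, "\\|")
docs <- tm_map(docs, toSpace, "&")
# Normalize case, then drop numbers, punctuation, stop words and extra whitespace
docs <- tm_map(docs, content_transformer(tolower))
docs <- tm_map(docs, removeNumbers)
docs <- tm_map(docs, removePunctuation)
docs <- tm_map(docs, removeWords, stopwords("english"))
docs <- tm_map(docs, removeWords, c("amp"))
docs <- tm_map(docs, stripWhitespace)
# Build the document-term matrix and fit a 5-topic LDA model
dtm <- DocumentTermMatrix(docs)
dtm_LDA <- LDA(dtm, 5)
get_terms(dtm_LDA, 10)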

I have tried set.seed, but it does not seem to work. I also found a similar question, "LDA model generates different topics every time I train on the same corpus", but it is about Python.


Answer 1:


For anyone who comes across the same issue: you can fix the random seed by passing it in the control argument of the LDA function, as below. Find more information here.

data("AssociatedPress", package = "topicmodels")
lda <- LDA(AssociatedPress[1:20, ], control = list(seed = 0), k = 2)
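
As far as I understand the package documentation, the default VEM method takes its seed from the control list (defaulting to a value based on the current time) rather than from R's global random number generator, which would explain why calling set.seed beforehand has no visible effect. The sketch below is one way to check that a fixed seed gives repeatable topics; it assumes the AssociatedPress data bundled with topicmodels, and the seed value 1234 and the names fit_a, fit_b and same_terms are arbitrary choices for illustration.

```r
library(topicmodels)

# AssociatedPress is a document-term matrix that ships with topicmodels.
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:20, ]

# Fit the same model twice, passing an identical seed through `control`.
# The value 1234 is arbitrary; any fixed integer should do.
fit_a <- LDA(dtm, k = 2, control = list(seed = 1234))
fit_b <- LDA(dtm, k = 2, control = list(seed = 1234))

# Compare the top 10 terms of every topic across the two fits.
same_terms <- identical(terms(fit_a, 10), terms(fit_b, 10))
print(same_terms)
```

Applied to the code in the question, the equivalent change would be LDA(dtm, 5, control = list(seed = 1234)).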


Source: https://stackoverflow.com/questions/31742181/different-results-of-lda-using-rtopicmodels
