How to specify random_state in an LDA model for topic modelling

大城市里の小女人 submitted on 2021-02-11 15:10:31

Question


I read the gensim LDA model documentation about random_state which states that:

random_state ({np.random.RandomState, int}, optional) – Either a randomState object or a seed to generate one. Useful for reproducibility.

I have tried passing random_state=42, or

random_seed=42
state=np.random.RandomState(random_seed)
state.randn(1)
random_state=state.randn(1) 

Neither of these worked. Can anyone suggest what I should do?

model=ldaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics, random_state=None)

I tried the function without random_state and it works, but with random_state I get an error message saying the LDA model is not defined:

def compute_coherence_values(dictionary, corpus, texts, limit, random_state, start=2, step=3):
    coherence_values = []
    model_list = []
    for num_topics in range(start, limit, step):
        #model=LdaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics)
        model=ldaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics, 
                                                  random_state)
        model_list.append(model)
        coherencemodel = CoherenceModel(model=model, texts=texts, dictionary=dictionary, coherence='c_v')
        coherence_values.append(coherencemodel.get_coherence())

    return model_list, coherence_values

Answer 1:


The mistake in your code is here:

 model=ldaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics, 
                                                  random_state)

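You can't pass the variable random_state on its own after the other arguments. In Python, a positional argument that follows keyword arguments is a SyntaxError, so once corpus, id2word and num_topics are passed by name, random_state has to be passed by name as well. If you still get a "not defined" error after that, check the class name: gensim's class is LdaModel (capital L), so ldaModel only exists if you imported it under that alias. The call should look like this: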

model=ldaModel(corpus=corpus, id2word=dictionary, num_topics=num_topics,
               random_state=random_state)
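The rule behind this fix can be checked without gensim installed. The sketch below is only an illustration: build_model is a hypothetical stand-in for the model constructor, and random_state_value is a made-up variable name. It uses the built-in compile() to show that the positional form is rejected before the call ever runs, while the keyword form is accepted.

```python
# Hypothetical stand-in for a model constructor; not part of gensim.
def build_model(corpus=None, num_topics=10, random_state=None):
    return {"num_topics": num_topics, "random_state": random_state}


def compiles(source):
    """Return True if the expression parses, False if it is a SyntaxError."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Positional argument after keyword arguments: rejected at compile time.
bad_call = "build_model(corpus=[], num_topics=5, random_state_value)"
# Same call with the value passed by name: accepted.
good_call = "build_model(corpus=[], num_topics=5, random_state=random_state_value)"

bad_ok = compiles(bad_call)
good_ok = compiles(good_call)

# The keyword form can then be called normally.
random_state_value = 42
model = build_model(corpus=[], num_topics=5, random_state=random_state_value)
```

Because the failure happens at compile time, the function containing the bad call is never defined at all, which can lead to confusing follow-up errors later in a notebook session.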

For comparison, I have an LDA implementation that uses LatentDirichletAllocation from sklearn.decomposition, and its random_state parameter also takes an integer. Here is an example:

from sklearn.decomposition import LatentDirichletAllocation

lda_model = LatentDirichletAllocation(n_components=10,
                                      max_iter=10,
                                      learning_method='online',
                                      random_state=100,   # integer seed for reproducibility
                                      batch_size=128,
                                      evaluate_every=-1,
                                      n_jobs=-1)
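The "seed or generator object" convention quoted in the question can be illustrated with the standard library alone. This is a rough sketch, not gensim's or scikit-learn's actual code: resolve_random_state and sample are hypothetical helpers, and random.Random stands in for np.random.RandomState. It shows that an integer seed and a generator built from that seed give the same draws, and that a number drawn from the generator (the analogue of state.randn(1) in the question) is neither of the two accepted forms.

```python
import random


def resolve_random_state(random_state):
    """Hypothetical helper: accept an int seed or a generator object."""
    if isinstance(random_state, random.Random):
        return random_state
    if isinstance(random_state, int) and not isinstance(random_state, bool):
        return random.Random(random_state)
    raise ValueError(f"{random_state!r} is neither an int seed nor a generator")


def sample(random_state, n=3):
    """Draw n numbers from the generator resolved from random_state."""
    rng = resolve_random_state(random_state)
    return [rng.random() for _ in range(n)]


run_1 = sample(42)                  # integer seed
run_2 = sample(42)                  # same seed again, same draws
run_3 = sample(random.Random(42))   # generator object built from the same seed

# A value drawn from the generator is a float, not a seed or a generator.
try:
    sample(random.Random(42).random())
    rejected = False
except ValueError:
    rejected = True
```

Under these assumptions, the practical takeaway is to pass either the integer itself or the generator object, never a sample taken from it.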

Here is a good tutorial on how to implement an LDA



Source: https://stackoverflow.com/questions/61373994/how-to-specify-random-state-in-lda-model-for-topic-modelling
