How to get the topic probability for each document for topic modeling using LDA

♀尐吖头ヾ submitted on 2020-12-07 07:33:17


I use scikit-learn's LDA to fit a topic model, and afterwards I can get the topic-term distribution. How can I get the probability of each topic for each document?


Use the transform method of the LatentDirichletAllocation class after fitting the model. It returns the document-topic distribution as an array with one row per document and one column per topic.

If you work with the example given in the documentation for scikit-learn's Latent Dirichlet Allocation, the document topic distribution can be accessed by appending the following line to the code:

doc_topic_dist = lda.transform(tf)

Here, lda is the fitted LDA model and tf is the document-term matrix (the word counts for each document) that the model was fitted on.
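For a self-contained illustration, here is a minimal sketch of the same call on a toy corpus. The four sample documents, the choice of two topics, and the fixed random_state are assumptions made up for this example and are not part of the scikit-learn documentation example. In recent scikit-learn versions the rows returned by transform are normalized, so each row should sum to 1.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus, used only for illustration.
docs = [
    "the cat sat on the mat",
    "dogs and cats are common pets",
    "stock markets fell sharply today",
    "investors sold shares as markets dropped",
]

# Build the document-term count matrix (called tf in the answer above).
tf_vectorizer = CountVectorizer()
tf = tf_vectorizer.fit_transform(docs)

# Fit LDA; n_components=2 is an arbitrary choice for this small example.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(tf)

# Document-topic distribution: shape (n_documents, n_topics).
doc_topic_dist = lda.transform(tf)

# Index of the most probable topic for each document.
dominant_topic = doc_topic_dist.argmax(axis=1)

for i, row in enumerate(doc_topic_dist):
    print(f"doc {i}: topic probabilities = {row}, dominant topic = {dominant_topic[i]}")
```

The same transform call also works on documents the model has not seen, as long as they are vectorized with the same fitted tf_vectorizer (using its transform method, not fit_transform).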

