topic-modeling

IndexError when trying to update gensim's LdaModel

放肆的年华 submitted on 2020-12-26 11:02:18
Question: I am getting the following error when trying to update my gensim LdaModel: IndexError: index 6614 is out of bounds for axis 1 with size 6614. I looked at why other people ran into this issue on this thread. Their mistake was using different dictionaries, whereas I use the same dictionary from beginning to end. Because my dataset is large, I load it chunk by chunk (using pickle.load). I build the dictionary iteratively with this piece of code: fr_documents_lda = open("documents

How to get the topic probability for each document for topic modeling using LDA

♀尐吖头ヾ submitted on 2020-12-07 07:33:17
Question: I use scikit-learn's LDA to build a model, and from it I can get the topic-term distribution. How can I get the probability of each topic for each document? Answer 1: Use the transform method of the LatentDirichletAllocation class after fitting the model. It returns the document-topic distribution. If you work with the example given in the documentation for scikit-learn's Latent Dirichlet Allocation, the document-topic distribution can be accessed by appending the following line to the
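A minimal sketch of the approach described in the answer, not the answer's own code (which is cut off above). The toy documents, the number of topics, and the variable names are assumptions made for illustration.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus, made up for this example.
docs = [
    "cats and dogs are popular pets",
    "dogs chase cats in the garden",
    "stocks and bonds are financial assets",
    "investors trade stocks and bonds",
]

# LDA works on raw term counts, so use a bag-of-words matrix.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# One row per document, one column per topic.
doc_topic = lda.transform(X)
```

In recent scikit-learn versions each row of `doc_topic` is normalized to sum to 1, so `doc_topic[i, k]` can be read as the weight of topic `k` in document `i`. Calling `lda.fit_transform(X)` fits and transforms in one step.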