I am using the Gensim HDP module on a set of documents.
>>> hdp = models.HdpModel(corpusB, id2word=dictionaryB)
>>> topics = hdp.print_topics(
There is apparently a bug in Gensim(version 3.8.3), in which giving -1
to show_topics
doesn't return anything at all. So I have tweaked the answers by Roko Mijic and aaron.
def topic_prob_extractor(gensim_hdp):
shown_topics = gensim_hdp.show_topics(num_topics=gensim_hdp.m_T, formatted=False)
topics_nos = [x[0] for x in shown_topics ]
weights = [ sum([item[1] for item in shown_topics[topicN][1]]) for topicN in topics_nos ]
return pd.DataFrame({'topic_id' : topics_nos, 'weight' : weights})