How to print the LDA topics models from gensim? Python

Using gensim I was able to extract topics from a set of documents in LSA but how do I access the topics generated from the LDA models?

When printing the

10 Answers

醉梦人生

2020-12-04 14:21

Are you using any logging? print_topics prints to the logfile as stated in the docs.

As @mac389 says, lda.show_topics() is the way to go to print to screen.
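To illustrate why a call that reports through logging can appear to print nothing, here is a minimal sketch using only the standard library. The `log_topics` function and the logger name `demo.lda` are hypothetical stand-ins, not gensim code; the assumption is that the library emits its topic lines at INFO level, which Python's logging drops by default.

```python
import io
import logging

# Hypothetical stand-in for a library call that reports through logging
# rather than print(); the logger name "demo.lda" is illustrative only.
logger = logging.getLogger("demo.lda")


def log_topics(topics):
    """Log each topic at INFO level and return the list unchanged."""
    for topic_id, description in topics:
        logger.info("topic #%d: %s", topic_id, description)
    return topics


topics = [
    (0, '0.020*"business" + 0.018*"data"'),
    (1, '0.027*"data" + 0.020*"experience"'),
]

# Capture log output in memory so the effect of the level is easy to see.
captured = io.StringIO()
logger.addHandler(logging.StreamHandler(captured))

log_topics(topics)             # effective level is WARNING, so INFO is dropped
before = captured.getvalue()

logger.setLevel(logging.INFO)  # enable INFO for this logger
log_topics(topics)
after = captured.getvalue()

print(repr(before))            # nothing was logged the first time
print(after, end="")           # both topics were logged the second time

# The returned list can always be printed directly, logging or not.
for topic_id, description in log_topics(topics):
    print(topic_id, description)
```

With gensim itself, the setup usually shown is `logging.basicConfig(format="%(asctime)s : %(levelname)s : %(message)s", level=logging.INFO)` before training or calling `print_topics()`.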

南方客

2020-12-04 14:21

Using Gensim's own preprocessing helpers to clean up its topic format:

from gensim.parsing.preprocessing import preprocess_string, strip_punctuation, strip_numeric

lda_topics = lda.show_topics(num_words=5)

topics = []
filters = [lambda x: x.lower(), strip_punctuation, strip_numeric]

for topic in lda_topics:
    print(topic)
    topics.append(preprocess_string(topic[1], filters))

print(topics)

Output:

(0, '0.020*"business" + 0.018*"data" + 0.012*"experience" + 0.010*"learning" + 0.008*"analytics"')
(1, '0.027*"data" + 0.020*"experience" + 0.013*"business" + 0.010*"role" + 0.009*"science"')
(2, '0.026*"data" + 0.016*"experience" + 0.012*"learning" + 0.011*"machine" + 0.009*"business"')
(3, '0.028*"data" + 0.015*"analytics" + 0.015*"experience" + 0.008*"business" + 0.008*"skills"')
(4, '0.014*"data" + 0.009*"learning" + 0.009*"machine" + 0.009*"business" + 0.008*"experience"')


[
  ['business', 'data', 'experience', 'learning', 'analytics'], 
  ['data', 'experience', 'business', 'role', 'science'], 
  ['data', 'experience', 'learning', 'machine', 'business'], 
  ['data', 'analytics', 'experience', 'business', 'skills'], 
  ['data', 'learning', 'machine', 'business', 'experience']
]
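If the weights are needed as well as the words, the formatted strings can also be parsed with a regular expression. This is a rough sketch using only the standard library; `parse_topic` and `TERM` are names made up for illustration, and the pattern assumes every term has the `weight*"word"` shape seen in the output above.

```python
import re

# Topic tuples copied from the output above (first two topics only).
lda_topics = [
    (0, '0.020*"business" + 0.018*"data" + 0.012*"experience" + 0.010*"learning" + 0.008*"analytics"'),
    (1, '0.027*"data" + 0.020*"experience" + 0.013*"business" + 0.010*"role" + 0.009*"science"'),
]

# Each term looks like 0.020*"business": a weight, an asterisk, a quoted word.
TERM = re.compile(r'([0-9.]+)\*"([^"]+)"')


def parse_topic(topic_string):
    """Return a list of (word, weight) pairs from one formatted topic string."""
    return [(word, float(weight)) for weight, word in TERM.findall(topic_string)]


# Map each topic id to its parsed (word, weight) pairs.
parsed = {topic_id: parse_topic(text) for topic_id, text in lda_topics}

for topic_id, pairs in parsed.items():
    print(topic_id, pairs)
```

Calling `show_topics(formatted=False)` avoids the string parsing altogether, since it returns the word and weight pairs rather than a formatted string.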

走了就别回头了

2020-12-04 14:21

Here is sample code to print topics:

import pickle

from gensim import corpora, models


def ExtractTopics(filename, numTopics=5):
    # filename is a pickle file holding a list of token lists (one bag of words per document)
    with open(filename, "rb") as f:
        texts = pickle.load(f)

    # generate dictionary (named "dictionary" to avoid shadowing the built-in dict)
    dictionary = corpora.Dictionary(texts)

    # remove words with low document frequency; 3 is an arbitrary threshold picked here
    low_occurrence_ids = [tokenid for tokenid, docfreq in dictionary.dfs.items() if docfreq <= 3]
    dictionary.filter_tokens(low_occurrence_ids)
    dictionary.compactify()
    corpus = [dictionary.doc2bow(t) for t in texts]

    # generate the LDA model; id2word makes show_topics return words instead of ids
    lda = models.ldamodel.LdaModel(corpus, id2word=dictionary, num_topics=numTopics)

    # print the top 20 words of each topic
    for topic_id, words in lda.show_topics(num_topics=numTopics, num_words=20, formatted=False):
        print("Topic #" + str(topic_id + 1) + ":", " ".join(word for word, prob in words))

南方客

2020-12-04 14:25
I recently came across a similar issue while working with Python 3 and Gensim 2.3.0. print_topics() and show_topics() did not raise any error, but they did not print anything either. It turns out that show_topics() returns a list, so one can simply do:
```
topic_list = lda.show_topics()  # lda is the trained LdaModel
print(topic_list)
```