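The wordcloud library
- Word cloud: a library dedicated to generating word cloud images from text
- By default, wordcloud treats spaces and punctuation as word separators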
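The effect of this default on Chinese text can be illustrated with a rough stand-in. The sketch below uses only the standard library; the regular expression and the helper name split_like_default are assumptions chosen for illustration and are not wordcloud's actual internals.

```python
import re

def split_like_default(text):
    # Assumption: approximate "split on spaces and punctuation"
    # by collecting runs of word characters.
    return re.findall(r"\w+", text)

english = "the quick brown fox, the lazy dog"
chinese = "黛玉笑道,宝玉来了"

print(split_like_default(english))
# ['the', 'quick', 'brown', 'fox', 'the', 'lazy', 'dog']
print(split_like_default(chinese))
# ['黛玉笑道', '宝玉来了']
```

The English sentence breaks into individual words, while the Chinese sentence only breaks at the comma, so each clause would be counted as a single "word". This is why Chinese text needs to be segmented before it is handed to wordcloud.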
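wordcloud.WordCloud() creates the word cloud object.
For Chinese text, first segment it with jieba (jieba.lcut returns a list of words), then join the words with spaces.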
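The segment-then-join step can be sketched as a small helper. The function names join_tokens and segment_and_join, the optional segment parameter, and the dropping of whitespace-only tokens are assumptions added here for illustration; the notes themselves simply call ' '.join on the list.

```python
def join_tokens(tokens):
    """Join segmented tokens with single spaces.

    Whitespace-only tokens (such as newlines) are dropped; this is an
    extra tidy-up step, not something the original notes require.
    """
    return " ".join(t for t in tokens if t.strip())

def segment_and_join(text, segment=None):
    """Segment text and return a space-separated string.

    segment is any callable returning a list of tokens; when omitted,
    jieba.lcut is used (jieba must be installed).
    """
    if segment is None:
        import jieba  # third-party; imported lazily so join_tokens works without it
        segment = jieba.lcut
    return join_tokens(segment(text))

# Hand-written tokens standing in for the output of jieba.lcut
tokens = ["宝玉", "和", "黛玉", "\n", "说话"]
print(join_tokens(tokens))  # 宝玉 和 黛玉 说话
```

With jieba installed, segment_and_join(txt) would return a string that can be passed directly to WordCloud.generate.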
Word cloud for 红楼梦 (Dream of the Red Chamber)
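# Count word frequencies in 红楼梦.txt and print the 15 most common words
import jieba

# Assumption: the file is UTF-8 encoded; change the encoding if yours differs
with open('红楼梦.txt', 'r', encoding='utf-8') as f:
    txt = f.read()

words = jieba.lcut(txt)  # segment the text into a list of words
counts = {}
for word in words:
    if len(word) == 1:  # skip single-character tokens
        continue
    counts[word] = counts.get(word, 0) + 1

items = list(counts.items())
items.sort(key=lambda x: x[1], reverse=True)  # sort by count, descending
for word, count in items[:15]:
    print("{0:<10}{1:>5}".format(word, count))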
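The same counting can be written more compactly with collections.Counter from the standard library. This is an alternative sketch, not the method used in the notes; the helper name top_words, its parameters, and the sample token list are made up for illustration.

```python
from collections import Counter

def top_words(tokens, n=15, min_len=2):
    """Return the n most frequent tokens with at least min_len characters."""
    counts = Counter(t for t in tokens if len(t) >= min_len)
    return counts.most_common(n)

# Hand-written sample standing in for jieba.lcut(txt)
sample = ["宝玉", "的", "黛玉", "宝玉", "了", "贾母", "宝玉", "黛玉"]
for word, count in top_words(sample, n=3):
    print("{0:<10}{1:>5}".format(word, count))
```

most_common already returns the pairs sorted by count in descending order, so the explicit list conversion and sort are not needed.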
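# Generate the word cloud image for 红楼梦.txt
import jieba
from wordcloud import WordCloud

# Assumption: the file is UTF-8 encoded; change the encoding if yours differs
with open('红楼梦.txt', 'r', encoding='utf-8') as f:
    txt = f.read()

words = jieba.lcut(txt)
newtxt = ' '.join(words)  # wordcloud splits on spaces, so join the words with spaces
wordcloud = WordCloud(
    background_color='white', width=800, height=600,
    # A font containing Chinese glyphs is required; msyh.ttc is Microsoft YaHei
    # on Windows, so adjust the path for your system.
    font_path='msyh.ttc',
    max_words=200, max_font_size=80,
).generate(newtxt)
wordcloud.to_file('红楼梦词云.png')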
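If the word counts have already been computed, they can also be passed to WordCloud.generate_from_frequencies instead of joining the words back into a string. The sketch below is one possible arrangement under that assumption; the helper names build_frequencies and render_from_frequencies and their parameters are hypothetical, and the font path must point to a font on your system that contains Chinese glyphs.

```python
from collections import Counter

def build_frequencies(tokens, min_len=2):
    """Map each token of at least min_len characters to its count."""
    return dict(Counter(t for t in tokens if len(t) >= min_len))

def render_from_frequencies(freqs, out_path, font_path):
    """Draw a word cloud from a {word: count} dict and save it as an image."""
    from wordcloud import WordCloud  # third-party; imported lazily
    wc = WordCloud(background_color='white', width=800, height=600,
                   font_path=font_path, max_words=200, max_font_size=80)
    wc.generate_from_frequencies(freqs)
    wc.to_file(out_path)

# Hand-written sample standing in for jieba.lcut(txt)
freqs = build_frequencies(["宝玉", "的", "黛玉", "宝玉"])
print(freqs)  # {'宝玉': 2, '黛玉': 1}
```

A call such as render_from_frequencies(freqs, '红楼梦词云.png', 'msyh.ttc') would then write the image. Working from frequencies makes it easy to filter out single-character tokens before drawing, which the space-joined string does not do.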
Source: CSDN
Author: 我是小杨我就这样
Link: https://blog.csdn.net/weixin_44478378/article/details/104589522