jieba (结巴) Chinese Word Segmentation
1. Text input: segmenting a string.

import jieba

text = '我们在野生动物园玩'
wordlist = jieba.lcut(text)   # jieba.lcut returns the segmentation result as a list
print(wordlist)

Output: a list of the segmented words.

2. File input: segmenting a file with a user dictionary and a stop-word list.

import jieba
import jieba.analyse

jieba.load_userdict("D:/python course/wordsegment/dict/dict.txt")                  # user dictionary of words to match
jieba.analyse.set_stop_words("D:/python course/wordsegment/dict/stop_words.txt")   # stop-word list

def splitSentence(inputFile, outputFile):
    fin = open(inputFile, 'r', encoding='utf-8')    # text to be segmented
    fout = open(outputFile, 'w', encoding='utf-8')  # segmentation result
    for line in fin:
        line = line.strip()
        # extract keywords from the line; the original snippet breaks off here,
        # so joining the tags and writing them out is an assumed (but typical) completion
        tags = jieba.analyse.extract_tags(line)
        fout.write(' '.join(tags) + '\n')
    fin.close()
    fout.close()

splitSentence('D:/python course/wordsegment/data/input.txt',
              'D:/python course/wordsegment/data/output.txt')
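Note that jieba.analyse.extract_tags performs TF-IDF keyword extraction (it keeps only the highest-scoring terms, 20 by default), not full segmentation. If every word of each line is wanted instead, the jieba.lcut call from section 1 can be applied line by line. Below is a minimal sketch under that assumption, reusing the file paths from above; the helper name cutSentence is made up for illustration.

import jieba

def cutSentence(inputFile, outputFile):
    # hypothetical helper: plain segmentation of every line with jieba.lcut,
    # writing the tokens space-separated to the output file
    with open(inputFile, 'r', encoding='utf-8') as fin, \
         open(outputFile, 'w', encoding='utf-8') as fout:
        for line in fin:
            tokens = jieba.lcut(line.strip())
            fout.write(' '.join(tokens) + '\n')

cutSentence('D:/python course/wordsegment/data/input.txt',
            'D:/python course/wordsegment/data/output.txt')

Unlike extract_tags, this keeps stop words and duplicates, so choose between the two depending on whether the goal is keyword extraction or a fully segmented corpus.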