Gensim: TypeError: doc2bow expects an array of unicode tokens on input, not a single string

庸人自扰 2020-12-06 06:05

I am getting started with Python and have run into a problem while using gensim. I am trying to load files from my disk and process them (split them into tokens and lowercase them).

3 answers
  •  臣服心动
    2020-12-06 06:44

    Hello everyone, I ran into the same problem. This is what worked for me:

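        from gensim import corpora

        # Example input; replace with the text read from your file
        sentence = "Human machine interface for lab abc computer applications"

        # Tokenize the sentence into a list of lowercase words
        tokens = sentence.lower().split()

        # Dictionary expects a list of documents, each a list of tokens
        dictionary = corpora.Dictionary([tokens])
        print(dictionary)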
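
For several files, the same idea can be sketched as below. The error in the title is raised when `doc2bow` receives a raw string instead of a list of tokens, so each document has to be split first. The helper name `tokenize` and the sample strings in `raw_docs` are made up for illustration and stand in for whatever you read from disk; the gensim part is skipped if gensim is not installed.

```python
def tokenize(text):
    """Lowercase a raw string and split it on whitespace into a list of tokens."""
    return text.lower().split()


# Hypothetical sample documents standing in for the contents of your files.
raw_docs = ["Human machine interface", "Graph of trees"]

# One list of tokens per document: a list of lists of strings.
texts = [tokenize(doc) for doc in raw_docs]

try:
    from gensim import corpora
except ImportError:
    corpora = None  # gensim is not available; only the tokenizing step runs

if corpora is not None:
    # Dictionary takes an iterable of tokenized documents.
    dictionary = corpora.Dictionary(texts)
    # doc2bow takes ONE document as a list of tokens, never a raw string.
    corpus = [dictionary.doc2bow(tokens) for tokens in texts]
    print(corpus)
```

Passing `raw_docs[0]` straight to `dictionary.doc2bow` would reproduce the TypeError, while passing `texts[0]` works.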
    
