Implementing Bag-of-Words Naive-Bayes classifier in NLTK

无人共我 2020-12-02 07:44

I basically have the same question as this guy. The example in the NLTK book for the Naive Bayes classifier uses only one kind of feature: whether a word occurs in a document.

3 answers
  •  野趣味 (OP)
    2020-12-02 08:08

    The features in NLTK's Naive Bayes classifier are "nominal", not numeric. This means each feature can take one of a finite number of discrete values (labels), but those values can't be treated as frequencies.
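To make the nominal point concrete, here is a rough sketch of two ways to build feature dicts for `nltk.classify.NaiveBayesClassifier`. The helper names (`presence_features`, `count_features`, `train_naive_bayes`) and the tiny vocabulary are made up for illustration. Passing counts as values is accepted, but each distinct count is just another label, so a count of 2 is no "closer" to 3 than it is to 0.

```python
from collections import Counter


def presence_features(tokens, vocabulary):
    """Boolean bag-of-words: one nominal feature per vocabulary word."""
    present = set(tokens)
    return {f"contains({word})": (word in present) for word in vocabulary}


def count_features(tokens, vocabulary):
    """Counts as feature values. NLTK's Naive Bayes treats each distinct
    count as an unrelated label, not as a number."""
    counts = Counter(tokens)
    return {f"count({word})": counts[word] for word in vocabulary}


def train_naive_bayes(labeled_featuresets):
    """Train on a list of (feature_dict, label) pairs.

    NLTK is imported here so the helpers above work without it installed.
    """
    from nltk.classify import NaiveBayesClassifier
    return NaiveBayesClassifier.train(labeled_featuresets)


# Hypothetical toy data, only to show the shape of the feature dicts.
vocabulary = ["good", "bad"]
tokens = "good good movie".split()
presence = presence_features(tokens, vocabulary)
counted = count_features(tokens, vocabulary)
```

With the toy input, `presence` says only that "good" is there and "bad" is not, while `counted` maps "good" to the label 2.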

    So with the Bayes classifier, you cannot directly use word frequency as a feature. You could do something like use the 50 most frequent words from each text as your feature set, but that's quite a different thing.
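The "most frequent words per text" idea could look something like the sketch below. The function name `top_word_features` and the `contains(...)` naming are my own choices, not anything NLTK requires. Note that the frequencies only decide which words make the cut; the classifier still sees plain presence features.

```python
from collections import Counter


def top_word_features(tokens, n=50):
    """Mark the n most frequent words of one text as present.

    Frequency is used only to select words; the resulting features
    are still nominal (True for each selected word).
    """
    counts = Counter(tokens)
    return {f"contains({word})": True for word, _ in counts.most_common(n)}


# Hypothetical toy text; n is lowered from 50 to 2 to keep it readable.
features = top_word_features("a a a b b c".split(), n=2)
```

Dicts like this can be paired with labels and passed to `nltk.classify.NaiveBayesClassifier.train` in the usual way.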

    But maybe there are other classifiers in NLTK that do depend on frequency. I wouldn't know, but have you looked? I'd say it's worth checking out.
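One candidate worth checking, offered as a sketch rather than a recommendation: NLTK ships a wrapper, `nltk.classify.scikitlearn.SklearnClassifier`, that hands feature dicts to a scikit-learn estimator, and scikit-learn's `MultinomialNB` does treat values as counts. This assumes both nltk and scikit-learn are installed; the helper names and toy data are hypothetical.

```python
from collections import Counter


def count_featureset(tokens):
    """Map each word to its raw count; here the values are meant as numbers."""
    return dict(Counter(tokens))


def train_multinomial_nb(labeled_featuresets):
    """Train a multinomial Naive Bayes model through NLTK's sklearn wrapper.

    Assumption: nltk and scikit-learn are both installed. They are imported
    here so count_featureset stays usable without them.
    """
    from nltk.classify.scikitlearn import SklearnClassifier
    from sklearn.naive_bayes import MultinomialNB
    return SklearnClassifier(MultinomialNB()).train(labeled_featuresets)


# Hypothetical two-document training set, just to show the input shape.
train_set = [
    (count_featureset("good good fun".split()), "pos"),
    (count_featureset("bad bad boring".split()), "neg"),
]

# With both libraries available, training and classifying would be:
# classifier = train_multinomial_nb(train_set)
# classifier.classify(count_featureset("good fun fun".split()))
```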
