Dealing with negative values in sklearn MultinomialNB

与世无争的帅哥 提交于 2019-12-04 04:22:46

I recommend you that don't use Naive Bayes with SVD or other matrix factorization because Naive Bayes based on applying Bayes' theorem with strong (naive) independence assumptions between the features. Use other classifier, for example RandomForest

I tried this experiment with this results:

vectorizer = TfidfVectorizer(max_df=0.5, stop_words='english', use_idf=True)
lsa = NMF(n_components=100)
mnb = MultinomialNB(alpha=0.01)

train_text = vectorizer.fit_transform(raw_text_train)
train_text = lsa.fit_transform(train_text)
train_text = Normalizer(copy=False).fit_transform(train_text)

mnb.fit(train_text, train_labels)

This is the same case but I'm using NMP(non-negative matrix factorization) instead SVD and got 0,04% accuracy.

Changing the classifier MultinomialNB for RandomForest i got 79% accuracy.

Therefore change the classifier or don't apply a matrix factorization.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!