naivebayes

Naive Bayesian for Topic detection using “Bag of Words” approach

那年仲夏 submitted on 2020-01-22 04:25:29
Question: I am trying to implement a naive Bayesian approach to find the topic of a given document or stream of words. Is there a naive Bayesian approach that I might be able to look up for this? Also, I am trying to improve my dictionary as I go along. Initially, I have a bunch of words that map to topics (hard-coded). Depending on the occurrence of words other than the ones that are already mapped, and depending on how often these words occur, I want to add them to the mappings, hence …
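
A minimal bag-of-words sketch of what such a topic classifier could look like, using scikit-learn's CountVectorizer and MultinomialNB; the seed documents and topic names below are invented for illustration and stand in for the hard-coded word-to-topic mappings mentioned in the question.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Toy seed "documents" per topic (invented); in practice these could be
    # built from the existing word-to-topic mappings.
    docs = [
        "goal match player team score",
        "election vote parliament minister policy",
        "stock market shares investor trading",
    ]
    topics = ["sports", "politics", "finance"]

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)                 # bag-of-words counts
    clf = MultinomialNB().fit(X, topics)

    new_doc = ["the team lost the match despite a late goal"]
    print(clf.predict(vectorizer.transform(new_doc)))  # -> ['sports']

Words not yet in the vocabulary are simply ignored at prediction time; tracking which unseen words co-occur with confidently predicted topics is one way to grow the mappings over time.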

What is the approach used by BernoulliNB in the sklearn package for prediction?

痞子三分冷 submitted on 2020-01-16 15:27:17
Question: I was reading up on the implementation of naive Bayes in sklearn, and I was not able to understand the predict part of BernoulliNB. Code borrowed from the source:

    def _joint_log_likelihood(self, X):
        # .. some code omitted
        neg_prob = np.log(1 - np.exp(self.feature_log_prob_))
        # Compute neg_prob · (1 - X).T as ∑neg_prob - X · neg_prob
        jll = safe_sparse_dot(X, (self.feature_log_prob_ - neg_prob).T)
        jll += self.class_log_prior_ + neg_prob.sum(axis=1)
        return jll

What is the role of neg_prob in this? Can …
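
For context, the Bernoulli log-likelihood of a binary feature vector x under class c is ∑_i [x_i·log p_ci + (1 − x_i)·log(1 − p_ci)]; neg_prob holds the log(1 − p) terms, and the code above is an algebraic rearrangement of that sum (with the class prior added afterwards). A small numeric check with made-up probabilities (not sklearn internals) illustrating the equivalence:

    import numpy as np

    # Hypothetical per-class log P(x_i = 1 | c) for 2 classes, 4 binary features.
    feature_log_prob = np.log([[0.2, 0.7, 0.5, 0.1],
                               [0.6, 0.3, 0.4, 0.8]])
    neg_prob = np.log(1 - np.exp(feature_log_prob))   # log P(x_i = 0 | c)

    X = np.array([[1., 0., 1., 1.],
                  [0., 0., 1., 0.]])                  # two binary "documents"

    # Direct form: sum_i [x_i*log p_i + (1 - x_i)*log(1 - p_i)]
    direct = X @ feature_log_prob.T + (1 - X) @ neg_prob.T

    # sklearn's rearrangement: X @ (log p - log(1 - p)).T + sum_i log(1 - p_i)
    rearranged = X @ (feature_log_prob - neg_prob).T + neg_prob.sum(axis=1)

    print(np.allclose(direct, rearranged))            # True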

How to run naive Bayes from NLTK with Python Pandas?

人盡茶涼 submitted on 2020-01-15 12:16:07
Question: I have a csv file with a feature (people's names) and a label (people's ethnicities). I am able to set up the data frame using Python Pandas, but when I try to link that with the NLTK module to run a naive Bayes, I get the following error:

    Traceback (most recent call last):
      File "C:\Users\Desktop\file.py", line 19, in <module>
        classifier = nbc.train(train_set)
      File "E:\Program Files Extra\Python27\lib\site-packages\nltk\classify\naivebayes.py", line 194, in train
        for fname, fval in featureset.items() …
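
The traceback suggests train_set is not in the shape NLTK expects: nltk.NaiveBayesClassifier.train wants a list of (feature_dict, label) pairs rather than a DataFrame. A minimal sketch of converting a pandas frame into that format, with hypothetical column names and a made-up feature extractor:

    import nltk
    import pandas as pd

    # Stand-in for the real CSV; the column names are assumptions.
    df = pd.DataFrame({"name": ["Michael Jordan", "Anderson Silva"],
                       "ethnicity": ["English", "French"]})

    def name_features(name):
        # Tiny illustrative feature extractor (not the asker's).
        return {"last_two": name[-2:].lower(), "first_letter": name[0].lower()}

    # NLTK expects a list of (feature_dict, label) pairs.
    train_set = [(name_features(n), label)
                 for n, label in zip(df["name"], df["ethnicity"])]

    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print(classifier.classify(name_features("Muhammad Ali")))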

Text analysis - Unable to write output of Python program to csv or xls file

耗尽温柔 submitted on 2020-01-15 10:59:12
Question: Hi, I am trying to do sentiment analysis using a Naive Bayes classifier in Python 2.x. It reads the sentiments from a txt file and then outputs positive or negative based on the sample txt file sentiments. I want the output in the same form as the input, e.g. I have a text file of, let's say, 1000 raw sentiments and I want the output to show positive or negative against each sentiment. Please help. Below is the code I am using:

    import math
    import string

    def Naive_Bayes_Classifier(positive, …
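
One way to write one label per input sentence is the standard csv module, sketched below in Python 3 syntax; the file names and the classify stand-in are hypothetical, and the asker's Naive_Bayes_Classifier would be called in its place.

    import csv

    def classify(sentence):
        # Placeholder for the asker's Naive_Bayes_Classifier; assumed to
        # return "positive" or "negative" for a single sentence.
        return "positive" if "good" in sentence.lower() else "negative"

    with open("sentiments.txt") as fin, open("results.csv", "w", newline="") as fout:
        writer = csv.writer(fout)
        writer.writerow(["sentence", "label"])
        for line in fin:
            sentence = line.strip()
            if sentence:
                writer.writerow([sentence, classify(sentence)])

(Under Python 2.x the output file would be opened with "wb" and without the newline argument.)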

Unable to use Pandas and NLTK to train Naive Bayes (machine learning) in Python

生来就可爱ヽ(ⅴ<●) submitted on 2020-01-03 04:46:07
Question: Here is what I am trying to do. I have a csv file with column 1 holding people's names (e.g. "Michael Jordan", "Anderson Silva", "Muhammad Ali") and column 2 holding people's ethnicities (e.g. English, French, Chinese). In my code, I create the pandas data frame using all the data. Then I create additional data frames: one with only Chinese names and another one with only non-Chinese names. And then I create separate lists. The three_split function extracts the feature of each name by splitting them …
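
A rough sketch of the flow described above; the CSV name and column labels are assumptions, and character trigrams stand in for whatever the asker's three_split actually extracts.

    import random

    import nltk
    import pandas as pd

    # Assumed layout: column 1 = name, column 2 = ethnicity.
    df = pd.read_csv("names.csv", names=["name", "ethnicity"])

    def three_split(name):
        # Stand-in feature extractor: character trigrams of the name.
        name = name.lower().replace(" ", "")
        return {name[i:i + 3]: True for i in range(len(name) - 2)}

    labeled = ([(three_split(n), "Chinese")
                for n in df[df["ethnicity"] == "Chinese"]["name"]] +
               [(three_split(n), "not Chinese")
                for n in df[df["ethnicity"] != "Chinese"]["name"]])

    random.shuffle(labeled)
    cut = int(0.8 * len(labeled))
    train_set, test_set = labeled[:cut], labeled[cut:]

    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print(nltk.classify.accuracy(classifier, test_set))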

OpenCV 3 Bayes Classifier predictProb giving strange results

Deadly submitted on 2019-12-25 16:55:16
Question: I am training an OpenCV Bayes classifier on various features of people for re-identification. The classifier appears to be working and predict() gives an output, but the predictProb() function returns very large values for some features and 0 for others. Data of the form (this is one row of the matrix):

    0.2523284, 0.027687496, 0.0042156572, 0.0018417788, 5.1221455e-06, 0.00030639244, 3.1830291e-07;

gives probabilities of the order of 1.0710769e+21. Data of the form (this is one row of …
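
For reference, a minimal sketch of calling predictProb on OpenCV's NormalBayesClassifier with random stand-in data (none of this is the asker's code). As the 1e+21 values above suggest, the per-class scores do not necessarily sum to one, so normalizing each row is one way to compare them.

    import cv2
    import numpy as np

    # Made-up data: 40 samples, 7 continuous features, 2 classes.
    samples = np.random.rand(40, 7).astype(np.float32)
    responses = np.random.randint(0, 2, (40, 1)).astype(np.int32)

    model = cv2.ml.NormalBayesClassifier_create()
    model.train(samples, cv2.ml.ROW_SAMPLE, responses)

    query = np.random.rand(3, 7).astype(np.float32)
    _, outputs, output_probs = model.predictProb(query)

    # Normalize each row so the per-class scores are directly comparable.
    row_sums = output_probs.sum(axis=1, keepdims=True)
    normalized = output_probs / np.where(row_sums == 0, 1, row_sums)
    print(outputs.ravel(), normalized)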

Load Naïve Bayes model in Java code using the Weka jar

北战南征 submitted on 2019-12-25 08:28:44
Question: I have used Weka and made a Naive Bayes classifier using the Weka GUI. I then saved this model by following this tutorial. Now I want to load this model through Java code, but I am unable to find any way to load a saved model using Weka. My requirement is that I have to build the model separately and then use it in a separate program. If anyone can guide me in this regard, I will be thankful.

Answer 1: You can easily load a saved model in Java using this command: Classifier myCls = …

How to create training data for text classification on 4 categories

♀尐吖头ヾ submitted on 2019-12-25 07:26:59
Question: My machine learning goal is to search a Project Requirements document for potential risks (will cost more money) and opportunities (will save money). My idea is to classify sentences from the data into one of these categories: Risk, Opportunity, and Irrelevant (no risk, no opportunity, the default category). I will use a multinomial Naive Bayes classifier for this with tf-idf. Now I need data for my training set and test set. The way I will do this is label every sentence from the requirement …
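
A compact sketch of the multinomial Naive Bayes + tf-idf setup described above, with three invented sentences standing in for labeled sentences from a requirements document:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Invented training examples, one per category.
    sentences = [
        "The vendor may charge extra fees for late delivery",
        "Bulk purchasing could reduce material costs",
        "The document is written in English",
    ]
    labels = ["Risk", "Opportunity", "Irrelevant"]

    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    model.fit(sentences, labels)

    print(model.predict(["Penalty clauses may increase project cost"]))

In practice each category needs many labeled sentences, and a held-out test set drawn from the same kind of documents would be kept aside to measure accuracy.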

Multiclass Classification and probability prediction

佐手、 submitted on 2019-12-24 08:57:16
Question:

    import pandas as pd
    import numpy
    from sklearn import cross_validation
    from sklearn.naive_bayes import GaussianNB

    fi = "df.csv"

    # Open the file for reading and read in data
    file_handler = open(fi, "r")
    data = pd.read_csv(file_handler, sep=",")
    file_handler.close()

    # split the data into training and test data
    train, test = cross_validation.train_test_split(data, test_size=0.6, random_state=0)

    # initialise Gaussian Naive Bayes
    naive_b = GaussianNB()

    train_features = train.ix[:, 0:127]
    train_label …
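
For the probability-prediction part of the title, a sketch using the current scikit-learn API (the cross_validation module has since been replaced by model_selection, and DataFrame.ix has been removed). The file name comes from the question; the assumption that the label sits right after the 127 feature columns is mine and may not match the real data.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    data = pd.read_csv("df.csv")
    X = data.iloc[:, 0:127]          # feature columns, as in the question
    y = data.iloc[:, 127]            # assumed label column

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.6, random_state=0)

    naive_b = GaussianNB()
    naive_b.fit(X_train, y_train)

    # predict_proba returns one probability per class for every test row.
    probs = naive_b.predict_proba(X_test)
    print(naive_b.classes_)
    print(probs[:5])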