naivebayes

Naive Bayesian for Topic detection using “Bag of Words” approach

那年仲夏 submitted on 2020-01-22 04:25:29
Question: I am trying to implement a naive Bayesian approach to find the topic of a given document or stream of words. Is there a naive Bayesian approach that I might be able to look up for this? Also, I am trying to improve my dictionary as I go along. Initially, I have a bunch of words that map to topics (hard-coded). Depending on the occurrence of words other than the ones that are already mapped, and depending on how often these words occur, I want to add them to the mappings, hence …
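
A minimal bag-of-words sketch of what such a topic classifier could look like, using scikit-learn's CountVectorizer and MultinomialNB; the seed documents and topic names below are invented for illustration and stand in for the hard-coded word-to-topic mappings mentioned in the question.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Toy seed "documents" per topic (invented); in practice these could be
    # built from the existing word-to-topic mappings.
    docs = [
        "goal match player team score",
        "election vote parliament minister policy",
        "stock market shares investor trading",
    ]
    topics = ["sports", "politics", "finance"]

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)                 # bag-of-words counts
    clf = MultinomialNB().fit(X, topics)

    new_doc = ["the team lost the match despite a late goal"]
    print(clf.predict(vectorizer.transform(new_doc)))  # -> ['sports']

Words not yet in the vocabulary are simply ignored at prediction time; tracking which unseen words co-occur with confidently predicted topics is one way to grow the mappings over time.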

What is the approach used by BernoulliNB in the sklearn package for prediction?

痞子三分冷 submitted on 2020-01-16 15:27:17
Question: I was reading up on the implementation of naive Bayes in sklearn, and I was not able to understand the predict part of BernoulliNB. Code borrowed from the source:

    def _joint_log_likelihood(self, X):
        # .. some code omitted
        neg_prob = np.log(1 - np.exp(self.feature_log_prob_))
        # Compute neg_prob · (1 - X).T as ∑neg_prob - X · neg_prob
        jll = safe_sparse_dot(X, (self.feature_log_prob_ - neg_prob).T)
        jll += self.class_log_prior_ + neg_prob.sum(axis=1)
        return jll

What is the role of neg_prob in this? Can …
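
For context, the Bernoulli log-likelihood of a binary feature vector x under class c is ∑_i [x_i·log p_ci + (1 − x_i)·log(1 − p_ci)]; neg_prob holds the log(1 − p) terms, and the code above is an algebraic rearrangement of that sum (with the class prior added afterwards). A small numeric check with made-up probabilities (not sklearn internals) illustrating the equivalence:

    import numpy as np

    # Hypothetical per-class log P(x_i = 1 | c) for 2 classes, 4 binary features.
    feature_log_prob = np.log([[0.2, 0.7, 0.5, 0.1],
                               [0.6, 0.3, 0.4, 0.8]])
    neg_prob = np.log(1 - np.exp(feature_log_prob))   # log P(x_i = 0 | c)

    X = np.array([[1., 0., 1., 1.],
                  [0., 0., 1., 0.]])                  # two binary "documents"

    # Direct form: sum_i [x_i*log p_i + (1 - x_i)*log(1 - p_i)]
    direct = X @ feature_log_prob.T + (1 - X) @ neg_prob.T

    # sklearn's rearrangement: X @ (log p - log(1 - p)).T + sum_i log(1 - p_i)
    rearranged = X @ (feature_log_prob - neg_prob).T + neg_prob.sum(axis=1)

    print(np.allclose(direct, rearranged))            # True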

How to run naive Bayes from NLTK with Python Pandas?

人盡茶涼 submitted on 2020-01-15 12:16:07
Question: I have a csv file with a feature (people's names) and a label (people's ethnicities). I am able to set up the data frame using Python Pandas, but when I try to link that with the NLTK module to run a naive Bayes, I get the following error:

    Traceback (most recent call last):
      File "C:\Users\Desktop\file.py", line 19, in <module>
        classifier = nbc.train(train_set)
      File "E:\Program Files Extra\Python27\lib\site-packages\nltk\classify\naivebayes.py", line 194, in train
        for fname, fval in featureset.items() …
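
The traceback suggests train_set is not in the shape NLTK expects: nltk.NaiveBayesClassifier.train wants a list of (feature_dict, label) pairs rather than a DataFrame. A minimal sketch of converting a pandas frame into that format, with hypothetical column names and a made-up feature extractor:

    import nltk
    import pandas as pd

    # Stand-in for the real CSV; the column names are assumptions.
    df = pd.DataFrame({"name": ["Michael Jordan", "Anderson Silva"],
                       "ethnicity": ["English", "French"]})

    def name_features(name):
        # Tiny illustrative feature extractor (not the asker's).
        return {"last_two": name[-2:].lower(), "first_letter": name[0].lower()}

    # NLTK expects a list of (feature_dict, label) pairs.
    train_set = [(name_features(n), label)
                 for n, label in zip(df["name"], df["ethnicity"])]

    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print(classifier.classify(name_features("Muhammad Ali")))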

Text analysis - Unable to write output of Python program to csv or xls file

耗尽温柔 submitted on 2020-01-15 10:59:12
Question: Hi, I am trying to do sentiment analysis using a Naive Bayes classifier in Python 2.x. It reads the sentiments from a txt file and then outputs positive or negative based on the sample txt file sentiments. I want the output in the same form as the input, e.g. I have a text file of, let's say, 1000 raw sentiments and I want the output to show positive or negative against each sentiment. Please help. Below is the code I am using:

    import math
    import string

    def Naive_Bayes_Classifier(positive, …
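
One way to write one label per input sentence is the standard csv module, sketched below in Python 3 syntax; the file names and the classify stand-in are hypothetical, and the asker's Naive_Bayes_Classifier would be called in its place.

    import csv

    def classify(sentence):
        # Placeholder for the asker's Naive_Bayes_Classifier; assumed to
        # return "positive" or "negative" for a single sentence.
        return "positive" if "good" in sentence.lower() else "negative"

    with open("sentiments.txt") as fin, open("results.csv", "w", newline="") as fout:
        writer = csv.writer(fout)
        writer.writerow(["sentence", "label"])
        for line in fin:
            sentence = line.strip()
            if sentence:
                writer.writerow([sentence, classify(sentence)])

(Under Python 2.x the output file would be opened with "wb" and without the newline argument.)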

Unable to use Pandas and NLTK to train Naive Bayes (machine learning) in Python

生来就可爱ヽ(ⅴ<●) submitted on 2020-01-03 04:46:07
Question: Here is what I am trying to do. I have a csv file with column 1 holding people's names (e.g. "Michael Jordan", "Anderson Silva", "Muhammad Ali") and column 2 holding people's ethnicities (e.g. English, French, Chinese). In my code, I create the pandas data frame using all the data. Then I create additional data frames: one with only Chinese names and another one with only non-Chinese names. And then I create separate lists. The three_split function extracts the feature of each name by splitting them …
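
A rough sketch of the flow described above; the CSV name and column labels are assumptions, and character trigrams stand in for whatever the asker's three_split actually extracts.

    import random

    import nltk
    import pandas as pd

    # Assumed layout: column 1 = name, column 2 = ethnicity.
    df = pd.read_csv("names.csv", names=["name", "ethnicity"])

    def three_split(name):
        # Stand-in feature extractor: character trigrams of the name.
        name = name.lower().replace(" ", "")
        return {name[i:i + 3]: True for i in range(len(name) - 2)}

    labeled = ([(three_split(n), "Chinese")
                for n in df[df["ethnicity"] == "Chinese"]["name"]] +
               [(three_split(n), "not Chinese")
                for n in df[df["ethnicity"] != "Chinese"]["name"]])

    random.shuffle(labeled)
    cut = int(0.8 * len(labeled))
    train_set, test_set = labeled[:cut], labeled[cut:]

    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print(nltk.classify.accuracy(classifier, test_set))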

OpenCV 3 Bayes Classifier predictProb giving strange results

Deadly submitted on 2019-12-25 16:55:16
Question: I am training an OpenCV Bayes classifier on various features of people for re-identification. The classifier appears to be working and predict() gives an output, but the predictProb() function returns very large values for some features and 0 for others. Data of the form (this is one row of the matrix):

    0.2523284, 0.027687496, 0.0042156572, 0.0018417788, 5.1221455e-06, 0.00030639244, 3.1830291e-07;

gives probabilities of the order of 1.0710769e+21. Data of the form (this is one row of …
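
For reference, a minimal sketch of calling predictProb on OpenCV's NormalBayesClassifier with random stand-in data (none of this is the asker's code). As the 1e+21 values above suggest, the per-class scores do not necessarily sum to one, so normalizing each row is one way to compare them.

    import cv2
    import numpy as np

    # Made-up data: 40 samples, 7 continuous features, 2 classes.
    samples = np.random.rand(40, 7).astype(np.float32)
    responses = np.random.randint(0, 2, (40, 1)).astype(np.int32)

    model = cv2.ml.NormalBayesClassifier_create()
    model.train(samples, cv2.ml.ROW_SAMPLE, responses)

    query = np.random.rand(3, 7).astype(np.float32)
    _, outputs, output_probs = model.predictProb(query)

    # Normalize each row so the per-class scores are directly comparable.
    row_sums = output_probs.sum(axis=1, keepdims=True)
    normalized = output_probs / np.where(row_sums == 0, 1, row_sums)
    print(outputs.ravel(), normalized)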

Load Naïve Bayes model in Java code using the Weka jar

北战南征 submitted on 2019-12-25 08:28:44
Question: I have used Weka and made a Naive Bayes classifier using the Weka GUI. I then saved this model by following this tutorial. Now I want to load this model through Java code, but I am unable to find any way to load a saved model using Weka. My requirement is that I have to build the model separately and then use it in a separate program. If anyone can guide me in this regard, I will be thankful.

Answer 1: You can easily load a saved model in Java using this command: Classifier myCls = …

How to create training data for text classification on 4 categories

♀尐吖头ヾ submitted on 2019-12-25 07:26:59
Question: My machine learning goal is to search a Project Requirements document for potential risks (will cost more money) and opportunities (will save money). My idea is to classify sentences from the data into one of these categories: Risk, Opportunity, and Irrelevant (no risk, no opportunity, the default category). I will use a multinomial Naive Bayes classifier for this with tf-idf. Now I need data for my training set and test set. The way I will do this is label every sentence from the requirement …
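
A compact sketch of the multinomial Naive Bayes + tf-idf setup described above, with three invented sentences standing in for labeled sentences from a requirements document:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Invented training examples, one per category.
    sentences = [
        "The vendor may charge extra fees for late delivery",
        "Bulk purchasing could reduce material costs",
        "The document is written in English",
    ]
    labels = ["Risk", "Opportunity", "Irrelevant"]

    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    model.fit(sentences, labels)

    print(model.predict(["Penalty clauses may increase project cost"]))

In practice each category needs many labeled sentences, and a held-out test set drawn from the same kind of documents would be kept aside to measure accuracy.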

Multiclass Classification and probability prediction

佐手、 submitted on 2019-12-24 08:57:16
Question:

    import pandas as pd
    import numpy
    from sklearn import cross_validation
    from sklearn.naive_bayes import GaussianNB

    fi = "df.csv"

    # Open the file for reading and read in data
    file_handler = open(fi, "r")
    data = pd.read_csv(file_handler, sep=",")
    file_handler.close()

    # split the data into training and test data
    train, test = cross_validation.train_test_split(data, test_size=0.6, random_state=0)

    # initialise Gaussian Naive Bayes
    naive_b = GaussianNB()

    train_features = train.ix[:, 0:127]
    train_label …
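
For the probability-prediction part of the title, a sketch using the current scikit-learn API (the cross_validation module has since been replaced by model_selection, and DataFrame.ix has been removed). The file name comes from the question; the assumption that the label sits right after the 127 feature columns is mine and may not match the real data.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    data = pd.read_csv("df.csv")
    X = data.iloc[:, 0:127]          # feature columns, as in the question
    y = data.iloc[:, 127]            # assumed label column

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.6, random_state=0)

    naive_b = GaussianNB()
    naive_b.fit(X_train, y_train)

    # predict_proba returns one probability per class for every test row.
    probs = naive_b.predict_proba(X_test)
    print(naive_b.classes_)
    print(probs[:5])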