weka

Different results in Weka GUI and Weka via Java code

廉价感情. posted on 2019-12-02 01:52:37
I'm running text classification in Weka with the NaiveBayesMultinomialText classifier. When I use the GUI and test on the training data itself (without cross-validation), I get 93% accuracy, but when I try to do the same via Java code I get 67% accuracy. What might be wrong? In the GUI, I'm using the following configuration: Lnorm = 2.0; debug = False; lowercaseTokens = True; minWordFrequency = 3.0; norm = 1.0; normalizeDocLength = False; periodicPruning = 0; stemmer = NullStemmer; stopwords = pt-br-stopwords.dat; tokenizer = NgramTokenizer (default parameters, but max ngramsize = 2); useStopList = True

Too many attributes for ARFF format in Weka

吃可爱长大的小学妹 posted on 2019-12-02 01:28:28
I am working with a data set that has more than 10,000 dimensions. To use Weka I need to convert a text file into ARFF format, but because there are so many attributes, the file is too large even in sparse ARFF format. Is there a method, similar to the sparse format for data, that avoids writing so many attribute identifiers in the header of the ARFF file? For example: @attribute A1 NUMERICAL @attribute A2 NUMERICAL ... ... @attribute A10000 NUMERICAL I coded a script in AWK to format the following lines (in a TXT file) to an ARFF. example.txt source: Att_0 | Att_1 | Att_2 | ... | Att_n 1 | 2 | 3 | ... | 999 My script
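The AWK script itself is cut off above, so the following is only a rough Java sketch of the same idea: generate the ARFF header programmatically from the pipe-delimited line of names, and write each data line as a sparse row. The class and method names (`PipeToSparseArff`, `header`, `sparseRow`) are hypothetical, the sketch assumes every attribute is numeric and that names contain no spaces or quotes, and it uses the `numeric` type keyword. As far as I know, ARFF has no shorthand for declaring a range of attributes, so the header still lists each one.

```java
import java.util.ArrayList;
import java.util.List;

class PipeToSparseArff {
    // Builds an ARFF header from a pipe-delimited line of attribute names.
    // Assumption: every attribute is numeric and names need no quoting.
    static String header(String relation, String namesLine) {
        StringBuilder sb = new StringBuilder("@relation " + relation + "\n\n");
        for (String name : namesLine.split("\\|")) {
            sb.append("@attribute ").append(name.trim()).append(" numeric\n");
        }
        return sb.append("\n@data\n").toString();
    }

    // Converts one pipe-delimited data line into a sparse ARFF row:
    // zero values are omitted, the rest are written as "index value".
    static String sparseRow(String dataLine) {
        String[] cells = dataLine.split("\\|");
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < cells.length; i++) {
            String value = cells[i].trim();
            if (Double.parseDouble(value) != 0.0) {
                parts.add(i + " " + value);
            }
        }
        return "{" + String.join(", ", parts) + "}";
    }

    public static void main(String[] args) {
        System.out.print(header("example", "Att_0 | Att_1 | Att_2"));
        System.out.println(sparseRow("1 | 0 | 3"));
    }
}
```

Sparse rows only save space when most values are zero; for dense data the row is longer than the plain comma-separated form.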

Weka 3.8.1 can't link to mtj.jar, causing java.lang.ClassNotFoundException: no.uib.cipr.matrix.Matrix

点点圈 posted on 2019-12-01 22:06:56
Question: I'm processing some data in Weka, and I want to use the Weka API so that I can use my own algorithms. However, when I simply try to instantiate the LinearRegression class: LinearRegression myRegression = new LinearRegression() I got the same error as: This person had the same problem and rolled back to version 3.6.12. I checked my weka.jar and found that mtj.jar is included, so I'm sure something must be linked incorrectly. Downgrading the API version is not the best option for
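A ClassNotFoundException of this kind means the class could not be found on the runtime classpath. A jar nested inside another jar is generally not visible to the default class loader, which may be what is happening here. The sketch below is a small diagnostic, not a fix: the helper name `isLoadable` is hypothetical, and the only name taken from the question is the class `no.uib.cipr.matrix.Matrix`.

```java
class ClasspathCheck {
    // Returns true if the named class can be found by this class's loader.
    // The class is located but not initialized (second argument is false).
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String mtjClass = "no.uib.cipr.matrix.Matrix";
        System.out.println(mtjClass + " loadable: " + isLoadable(mtjClass));
        // Shows which jars and directories the JVM was actually started with.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the check prints `false`, one thing to try is putting a standalone mtj.jar on the classpath next to weka.jar; whether that resolves this particular setup is an assumption.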

Weka predict classification node

≡放荡痞女 posted on 2019-12-01 11:53:17
I have created a huge J48 tree of size around 7000, with many branches and leaves. I also get classification results for test images. I would like to know which node makes the classification for each result. In other words, is there a way in Weka to see the ID (or some other identifier) of the leaf node that makes the decision? As far as I know, you will not be able to do this from the Weka GUI. However, if you use the Weka API there is some hope. I am not an expert in Java, so the following steps may not follow best practice, but they do work for the small example I concocted.

how to calculate confidence from weka API?

自作多情 posted on 2019-12-01 11:06:46
I am using the Weka Java API, and I can get the predicted class label after training on the training set: double pred = fc.classifyInstance(test.instance(i)); But I want to know the confidence (probability) of the class label. What function should I use? In the GUI I can write the predictions to a text file and get the probability easily, but I want to know how to get it through code. I am using the J48() classifier. weka.classifiers.Classifier.distributionForInstance(Instance) Predicts the class memberships for a given instance. If an instance is unclassified, the returned array elements
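Once `distributionForInstance` has returned its array of class membership values, the confidence of the predicted label is simply the largest entry. The sketch below uses a hard-coded array as a stand-in for the Weka call so that it runs without Weka on the classpath; the class name `PredictionConfidence` and the helper `argMax` are hypothetical.

```java
class PredictionConfidence {
    // Index of the largest entry, i.e. the index of the predicted class.
    static int argMax(double[] dist) {
        int best = 0;
        for (int i = 1; i < dist.length; i++) {
            if (dist[i] > dist[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Stand-in values. With Weka this array would come from:
        //   double[] dist = fc.distributionForInstance(test.instance(i));
        double[] dist = {0.10, 0.75, 0.15};
        int predicted = argMax(dist);
        System.out.printf("class index %d, confidence %.2f%n", predicted, dist[predicted]);
    }
}
```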

Define input data for clustering using WEKA API

◇◆丶佛笑我妖孽 posted on 2019-12-01 09:42:10
I want to cluster points specified by latitude and longitude. I am using the WEKA API. The problem is with Instances instances = new Instances(40.01,1.02); So, how do I specify input data without using an ARFF file? I would just like to read an array into Instances . import java.io.Reader; import weka.clusterers.ClusterEvaluation; import weka.clusterers.SimpleKMeans; import weka.core.Instances; public class test { /** * @param args */ public static void main(String[] args) { Instances instances = new Instances(40.01,1.02); SimpleKMeans simpleKMeans = new SimpleKMeans(); simpleKMeans.buildClusterer

how to calculate confidence from weka API?

非 Y 不嫁゛ posted on 2019-12-01 08:32:04
Question: I am using the Weka Java API, and I can get the predicted class label after training on the training set: double pred = fc.classifyInstance(test.instance(i)); But I want to know the confidence (probability) of the class label. What function should I use? In the GUI I can write the predictions to a text file and get the probability easily, but I want to know how to get it through code. I am using the J48() classifier. Answer 1: weka.classifiers.Classifier.distributionForInstance(Instance)

Define input data for clustering using WEKA API

孤者浪人 posted on 2019-12-01 06:19:12
Question: I want to cluster points specified by latitude and longitude. I am using the WEKA API. The problem is with Instances instances = new Instances(40.01,1.02); So, how do I specify input data without using an ARFF file? I would just like to read an array into Instances . import java.io.Reader; import weka.clusterers.ClusterEvaluation; import weka.clusterers.SimpleKMeans; import weka.core.Instances; public class test { /** * @param args */ public static void main(String[] args) { Instances instances = new

WEKA classification likelihood of the classes

最后都变了- posted on 2019-12-01 05:18:35
I would like to know if there is a way in WEKA to output a number of "best guesses" for a classification. My scenario is: I classify the data with cross-validation, for instance, and then in Weka's output I get something like: these are the 3 best guesses for the classification of this instance. What I want is that, even if an instance isn't correctly classified, I get an output of the 3 or 5 best guesses for that instance. Example: Classes: A,B,C,D,E Instances: 1...10 And the output would be: instance 1 is 90% likely to be class A, 75% likely to be class B, 60% likely to be class C.. Thanks. Weka's API
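The answer is cut off above, so this is only a guess at one way to get "best guesses" through the API: take the per-class probability array for an instance (for example from `distributionForInstance`) and rank the class labels by it. The sketch below does the ranking step in plain Java with stand-in values; `TopGuesses` and `topK` are hypothetical names. Note that such a distribution normally sums to 1, so the figures would not look like the 90% / 75% / 60% in the question.

```java
import java.util.ArrayList;
import java.util.List;

class TopGuesses {
    // Returns the k labels with the highest probability, best first.
    static List<String> topK(String[] labels, double[] dist, int k) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < dist.length; i++) {
            order.add(i);
        }
        // Sort class indices by descending probability.
        order.sort((a, b) -> Double.compare(dist[b], dist[a]));
        List<String> best = new ArrayList<>();
        for (int i = 0; i < Math.min(k, order.size()); i++) {
            best.add(labels[order.get(i)]);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] labels = {"A", "B", "C", "D", "E"};
        // Stand-in for the array a classifier would return for one instance.
        double[] dist = {0.30, 0.50, 0.04, 0.10, 0.06};
        System.out.println("3 best guesses: " + topK(labels, dist, 3));
    }
}
```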

Weka GUI - Not enough memory, won't load?

混江龙づ霸主 posted on 2019-12-01 05:16:40
This same installation of Weka has loaded for me in the past. I am simply trying to load the Weka GUI (double-click on the icon) and I get the following error. How can I fix it?
OutOfMemory
Not enough memory. Please load a smaller dataset or use a larger heap size.
- initial JVM size: 122.4 MB
- total memory used: 165.3 MB
- max. memory avail.: 227.6 MB
Note: The Java heap size can be specified with the -Xmx option. etc..
I am not loading Weka from the command line, so how can I stop this from occurring? I'm not sure why you were able to use it before but not now. However, you can specify a
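To check which maximum heap size a JVM actually starts with, a few lines of Java are enough. This is a generic diagnostic sketch, not part of Weka; the class name `HeapInfo` is hypothetical.

```java
class HeapInfo {
    // Maximum heap the JVM will attempt to use, in whole megabytes.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MB):     " + maxHeapMb());
        System.out.println("current heap (MB): " + rt.totalMemory() / (1024 * 1024));
    }
}
```

Assuming the class is saved as HeapInfo.java, running it once as `java HeapInfo.java` and once as `java -Xmx1g HeapInfo.java` shows the effect of the -Xmx option mentioned in the error message.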