weka | 易学教程

How do you plot a CostSensitiveClassifier tree in R?

Question: In this case I'm using the RWeka package, with J48 inside the CostSensitiveClassifier function. I know that with the "party" package I can plot a normal J48 tree, but I'm not sure how to get a plot from the CSC output.

    library(RWeka)
    csc <- CostSensitiveClassifier(Species ~ ., data = iris,
        control = Weka_control(`cost-matrix` = matrix(c(0, 10, 0, 0, 0, 0, 0, 10, 0), ncol = 3),
                               W = "weka.classifiers.trees.J48", M = TRUE))
    csc

    CostSensitiveClassifier using minimized expected misclasification cost weka

WEKA: how to get the score from classifyInstance?

Question: I'm using FilteredClassifier.classifyInstance() to classify my instances in Weka. I have 2 classes (true and false) and many positives, so I need to know the score of each instance to find the best positive. Do you know how I could get the score from my Weka classifier? Thanks. Update: I've also tried distributionForInstance, but for each instance I always get the array [1.0, 0.0]. I actually need to compare several instances to see which one is the most reliable,

R: Clustering results are different every time I run

Question: I run the following with the 'amap' package several times:

    library(amap)
    set.seed(5)
    Kmeans(mydata, 5, iter.max=500, nstart=1, method="euclidean")

Even though the parameters and seed value are always the same, the clustering results are different every time I run Kmeans or other clustering methods. I tried kmeans functions from other packages, but the problem persists... In fact, I want to use Weka and R together, so I also tried SimpleKMeans in the RWeka package, and this always gives the same result.

Error in plot, formula missing

Question: I am trying to plot my SVM model.

    library(foreign)
    library(e1071)
    x <- read.arff("contact-lenses.arff")
    #alt: x <- read.arff("http://storm.cis.fordham.edu/~gweiss/data-mining/weka-data/contact-lenses.arff")
    model <- svm(`contact-lenses` ~ . , data = x, type = "C-classification", kernel = "linear")

The contact-lenses ARFF file is one of the sample data files that ship with Weka. However, I now run into an error when trying to plot the model:

    plot(model, x)
    Error in plot.svm(model, x) : missing formula.

Answer 1: The problem is

How to use LibSVM with Weka in my Java code?

I want to use the LibSVM classifier with Weka in my application. How can I do this, or where can I find good examples? andrew A little late now, surely, but I'll answer anyway. You have to use weka.jar, libsvm.jar, and wlsvm.jar (the LibSVM wrapper) in your project, so include all 3 jars in your build path or class path. You can get wlsvm.jar from here: http://ailab.ist.psu.edu/yasser/wlsvm.html You can get Weka from here: http://www.cs.waikato.ac.nz/ml/weka/ And you can get LibSVM from here: http://www.csie.ntu.edu.tw/~cjlin/libsvm/ I could not get this to work with weka

How to get prediction value for an instance in weka?

I am working with Weka and need to output the prediction values (probabilities) of each label for each test instance. In the GUI there is an option in the Classify tab (Classify -> options -> Output predicted value) that does this by printing the prediction probabilities for each label, but how do I do this in Java code? I want to receive probability scores for each label after classifying an instance. The following code takes in a set of training instances, and outputs the predicted probability for a specific instance.

    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    public class Main {

Cross Validation in Weka

From what I have read, I've always thought that cross-validation is performed like this: In k-fold cross-validation, the original sample is randomly partitioned into k subsamples. Of the k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k subsamples used exactly once as the validation data. The k results from the folds can then be averaged (or otherwise combined) to produce a single estimate. So k models are built
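The partitioning described above can be sketched in plain Java without Weka. This is only an illustration of the fold bookkeeping, not how Weka implements it: the class name KFoldSketch, the methods partition and trainingIndices, and the fixed seed are all hypothetical, and instances are represented by their integer indices only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper (not part of Weka): k-fold index bookkeeping only.
final class KFoldSketch {

    /** Shuffles the indices 0..n-1 and deals them into k folds of near-equal size. */
    static List<List<Integer>> partition(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // Random partition; the seed only makes the shuffle repeatable.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> result = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            result.add(new ArrayList<>());
        }
        // Deal the shuffled indices round-robin so fold sizes differ by at most one.
        for (int i = 0; i < n; i++) {
            result.get(i % k).add(indices.get(i));
        }
        return result;
    }

    /** Training indices for one round: every fold except the held-out one. */
    static List<Integer> trainingIndices(List<List<Integer>> folds, int heldOut) {
        List<Integer> train = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != heldOut) {
                train.addAll(folds.get(f));
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = partition(10, 5, 42L);
        // k rounds: each fold is the validation set exactly once.
        for (int f = 0; f < parts.size(); f++) {
            System.out.println("round " + f
                    + ": validate on " + parts.get(f)
                    + ", train on " + trainingIndices(parts, f));
        }
    }
}
```

In each round a model would be trained on the training indices and scored on the held-out fold; averaging the k scores gives the single estimate mentioned in the quoted description.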

How to reuse saved classifier created from explorer(in weka) in eclipse java

Question: I have created a classifier in WEKA and saved it on my hard disk. Now I want to use that classifier in Eclipse through the Weka API. How can I do this? Please guide me... Thank you.

Answer 1: Here is an example of loading a model to predict the value of instances. The example model is a J48 decision tree created and saved in the Weka Explorer. It was built from the nominal weather data provided with Weka. It is called "tree.model".

    //load model
    String rootPath="/some/where/";
    Classifier cls =

Creating a string attribute in Weka Java API

Question: I'm trying to create a new string Attribute using Weka's Java API... Reading through the API javadocs, it appears that the way to do so is to use this constructor:

    Attribute
    public Attribute(java.lang.String attributeName, FastVector attributeValues)

    Constructor for nominal attributes and string attributes. If a null vector of attribute values is passed to the method, the attribute is assumed to be a string.
    Parameters:
    attributeName - the name for the attribute
    attributeValues - a vector of

Running DBSCAN in ELKI

Question: I am trying to cluster some geospatial data, and I previously tried the WEKA library. I found this benchmark and decided to try ELKI. Despite the advice not to use ELKI as a Java library (which is supposed to be less maintained than the UI), I incorporated it into my application, and I can say that I am quite happy with the results. The structures it uses to store data are far more efficient than the ones used by Weka, and the fact that it has the option of using a spatial index is