decision-tree

How to print out the predicted class after cross-validation in WEKA

Submitted by 我是研究僧i on 2019-12-21 05:16:15

Question: Once 10-fold cross-validation is done with a classifier, how can I print out the predicted class of every instance and the class distribution for those instances?

    J48 j48 = new J48();
    Evaluation eval = new Evaluation(newData);
    eval.crossValidateModel(j48, newData, 10, new Random(1));

When I tried something like the loop below, it said that the classifier is not built.

    for (int i = 0; i < newData.numInstances(); i++) {
        System.out.println(j48.distributionForInstance(newData.instance(i)));
    }

What I'm trying to …

Exact implementation of RandomForest in Weka 3.7

Submitted by 天涯浪子 on 2019-12-21 03:15:11

Question: Having reviewed the original Breiman (2001) paper as well as some other board posts, I am slightly confused about the actual procedure used by WEKA's random forest implementation. None of the sources is sufficiently detailed, and many even contradict each other. How does it work in detail, and which steps are carried out? My understanding so far: for each tree, a bootstrap sample of the same size as the training data is created; only a random subset of the available features, of a defined size, …
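The two steps listed so far can be sketched in plain Python. This is a minimal illustration of bagging plus a per-split random feature subset, not Weka's actual code; the subset size log2(M) + 1 follows Breiman's suggestion and is an assumption here:

```python
import math
import random

def bootstrap_sample(data):
    """Draw n instances with replacement from a dataset of size n."""
    return [random.choice(data) for _ in range(len(data))]

def random_feature_subset(num_features, k):
    """Pick k distinct feature indices to consider at one split."""
    return random.sample(range(num_features), k)

random.seed(1)
data = list(range(100))                # stand-in for 100 training instances
num_features = 10
k = int(math.log2(num_features)) + 1   # Breiman's suggested subset size

sample = bootstrap_sample(data)        # same size as the training data
subset = random_feature_subset(num_features, k)
```

Each tree would then be grown on its own `sample`, drawing a fresh `subset` at every split and choosing the best split only among those features.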

Multivariate Decision Tree learner

Submitted by 笑着哭i on 2019-12-21 02:08:38

Question: A lot of univariate decision tree learner implementations (C4.5 etc.) exist, but does anyone actually know of multivariate decision tree learner algorithms? Answer 1: Bennett and Blue's "A Support Vector Machine Approach to Decision Trees" performs multivariate splits by embedding an SVM at each decision node of the tree. Similarly, in "Multicategory classification via discrete support vector machines" (2009), Orsenigo and Vercellis embed a multicategory variant of discrete support vector machines (DSVM) …
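The core idea in these approaches, splitting on a linear combination of features rather than a single feature, can be sketched with scikit-learn's LinearSVC standing in for the embedded split learner. This is a simplification of the cited methods, not their implementation; the two-class iris subset is an arbitrary choice:

```python
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

iris = load_iris()
mask = iris.target < 2              # two classes, so a single split suffices
X, y = iris.data[mask], iris.target[mask]

# One linear SVM plays the role of one multivariate decision node.
svm = LinearSVC(C=1.0, max_iter=10000).fit(X, y)

def multivariate_split(x):
    """Route an instance left/right on sign(w . x + b), not on one feature."""
    return 'right' if svm.decision_function([x])[0] > 0 else 'left'
```

A univariate learner like C4.5 would instead test a single coordinate against a threshold at each node; here the hyperplane uses all four iris features at once.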

TicTacToe AI Making Incorrect Decisions

Submitted by 喜夏-厌秋 on 2019-12-20 19:57:14

Question: A little background: as a way to learn multi-node trees in C++, I decided to generate all possible TicTacToe boards and store them in a tree such that the branch beginning at a node contains all boards that can follow from that node, and the children of a node are the boards that follow in one move. After that, I thought it would be fun to write an AI that plays TicTacToe using that tree as a decision tree. TTT is a solved game in which a perfect player never loses, so it seemed an easy AI to code …
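A perfect TicTacToe player over such a game tree amounts to minimax search. A compact stand-alone sketch, using a 9-character string as the board rather than the C++ node structure described above, is:

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(b):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for i, j, k in LINES:
        if b[i] != ' ' and b[i] == b[j] == b[k]:
            return b[i]
    return None

@lru_cache(maxsize=None)
def minimax(b, player):
    """Value of board b with `player` to move: +1 X wins, -1 O wins, 0 draw."""
    w = winner(b)
    if w is not None:
        return 1 if w == 'X' else -1
    if ' ' not in b:
        return 0
    nxt = 'O' if player == 'X' else 'X'
    vals = [minimax(b[:i] + player + b[i + 1:], nxt)
            for i, c in enumerate(b) if c == ' ']
    return max(vals) if player == 'X' else min(vals)

print(minimax(' ' * 9, 'X'))  # perfect play from the empty board is a draw: 0
```

An AI then simply picks, at each turn, the child board whose minimax value is best for the side to move; a bug in this selection step (e.g. maximizing for the wrong player) is the classic cause of "incorrect decisions".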

How to extract sklearn decision tree rules to pandas boolean conditions?

Submitted by 情到浓时终转凉″ on 2019-12-20 12:32:05

Question: There are many posts about how to extract sklearn decision tree rules, but I could not find any about doing it with pandas. Take this data and model as an example:

    # Create Decision Tree classifier object
    clf = DecisionTreeClassifier(criterion="entropy", max_depth=3)

    # Train Decision Tree classifier
    clf = clf.fit(X_train, y_train)

Expected: there are 8 rules in this example. From left to right (note that the dataframe is df):

    r1 = (df['glucose'] <= 127.5) & (df['bmi'] <= 26.45) & …
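One way to produce such strings is to walk the fitted `tree_` structure and emit one pandas condition per split on the path to each leaf. A sketch under assumptions: the column names and toy data below are made up, and the exact thresholds depend on the fit:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({'glucose': [90, 150, 120, 200, 85, 140],
                   'bmi':     [22.0, 30.1, 27.5, 33.2, 24.3, 29.0]})
y = [0, 1, 0, 1, 0, 1]

clf = DecisionTreeClassifier(criterion="entropy", max_depth=3,
                             random_state=0).fit(df, y)
t = clf.tree_

def leaf_rules(node=0, conds=()):
    """Yield one '(...) & (...)' rule string per leaf, left to right."""
    if t.children_left[node] == -1:          # -1 marks a leaf in sklearn
        yield ' & '.join(conds) if conds else 'True'
        return
    name, thr = df.columns[t.feature[node]], t.threshold[node]
    yield from leaf_rules(t.children_left[node],
                          conds + (f"(df['{name}'] <= {thr})",))
    yield from leaf_rules(t.children_right[node],
                          conds + (f"(df['{name}'] > {thr})",))

rules = list(leaf_rules())
for r in rules:
    print(r)
```

Each string is a valid pandas boolean expression, so `df[eval(rules[0])]` selects the rows falling into the first leaf.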

How to explore a decision tree built using scikit learn

Submitted by 偶尔善良 on 2019-12-20 09:19:57

Question: I am building a decision tree using

    clf = tree.DecisionTreeClassifier()
    clf = clf.fit(X_train, Y_train)

This all works fine. However, how do I then explore the decision tree? For example, how do I find which entries from X_train appear in a particular leaf? Answer 1: You need to use the predict method. After training the tree, you feed it the X values to predict their output.

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    clf = DecisionTreeClassifier(random…
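For the specific "which entries land in which leaf" question, `apply` (or `decision_path`) is more direct than `predict`: it returns the leaf index for every row. A small sketch on the iris data:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

leaf_ids = clf.apply(X)                 # leaf node id for every training row
for leaf in np.unique(leaf_ids):
    rows = np.where(leaf_ids == leaf)[0]
    print(f"leaf {leaf}: {len(rows)} training rows")
```

`np.where(leaf_ids == leaf)[0]` gives exactly the X_train row indices that fall into that leaf.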

Python Checking paths to leaf in binary tree python giving data in the leaf

Submitted by 有些话、适合烂在心里 on 2019-12-20 03:55:09

Question: Let's say I have this tree (Yes branches drawn on the left, No branches on the right):

                          cough
                Yes /            \ No
            sneezing              sneezing
         Yes /    \ No         Yes /    \ No
         fever    fever        fever    fever
     Yes/ \No  Yes/  \No    Yes/ \No  Yes/  \No
     dead cold influenza cold dead influenza cold healthy

and I want the paths to the illness "influenza". The output should look like this: [[True, False, True], [False, True, False]]. Going right from the root records True (Yes); going left records False (No). This is the code I have been trying to write for this function, but I'm doing …
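One way to collect such paths is plain recursion over a binary-node class. The node layout below is an assumption; True is recorded for a right turn and False for a left turn as the question states, so the Yes children are placed on the right:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def paths_to(node, target, path=()):
    """Return every root-to-leaf path reaching `target` as a list of booleans."""
    if node is None:
        return []
    if node.left is None and node.right is None:        # leaf
        return [list(path)] if node.value == target else []
    return (paths_to(node.right, target, path + (True,)) +
            paths_to(node.left, target, path + (False,)))

# A stand-in for the symptom tree above (Yes = right, No = left):
tree = Node('cough',
    right=Node('sneezing',
        right=Node('fever', right=Node('dead'), left=Node('cold')),
        left=Node('fever', right=Node('influenza'), left=Node('cold'))),
    left=Node('sneezing',
        right=Node('fever', right=Node('dead'), left=Node('influenza')),
        left=Node('fever', right=Node('cold'), left=Node('healthy'))))

print(paths_to(tree, 'influenza'))
# → [[True, False, True], [False, True, False]]
```

The tuple `path` is extended immutably at each step, so sibling branches never share or clobber one another's partial paths, a common bug when a single mutable list is passed down.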

Color of the node of tree with graphviz using class_names

Submitted by 喜你入骨 on 2019-12-20 02:15:38

Question: Expanding on a prior question, "Changing colors for decision tree plot created using export_graphviz": how would I color the nodes of the tree based on the dominant class (species of iris), instead of a binary distinction? This should require a combination of iris.target_names, the strings describing the classes, and iris.target, the classes.

    import pydotplus
    from sklearn.datasets import load_iris
    from sklearn import tree
    import collections

    clf = tree.DecisionTreeClassifier(random_state=42)
    iris…
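The color-assignment half can be done from `clf.tree_.value`, which stores the per-class sample counts of every node; the dominant class is simply the argmax. A sketch (the hex palette is an arbitrary assumption, and the graphviz rendering itself is left to the linked question):

```python
from sklearn.datasets import load_iris
from sklearn import tree

iris = load_iris()
clf = tree.DecisionTreeClassifier(random_state=42).fit(iris.data, iris.target)

# One color per species; the palette choice is arbitrary.
palette = dict(zip(iris.target_names, ('#e58139', '#39e581', '#8139e5')))

node_colors = []
for counts in clf.tree_.value:          # shape per node: (1, n_classes)
    dominant = counts[0].argmax()       # index of the majority class
    node_colors.append(palette[iris.target_names[dominant]])

print(node_colors[:3])
```

Each entry of `node_colors` can then be applied to the matching pydotplus node with `set_fillcolor`, as in the prior question's answer.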

How to extract the splitting rules for the terminal nodes of ctree()

Submitted by 北城以北 on 2019-12-19 04:38:28

Question: I have a data set with 6 categorical variables whose levels range from 5 to 28. I have obtained an output from ctree() (party package) with 17 terminal nodes. I followed the inputs by @Galled in "ctree() - How to get the list of splitting conditions for each terminal node?" to arrive at my desired output, but I'm getting the following error after running the code:

    Error in data.frame(ResulTable, Means, Counts) :
      arguments imply differing number of rows: 17, 2

I have tried adding this …

Prune unnecessary leaves in sklearn DecisionTreeClassifier

Submitted by 巧了我就是萌 on 2019-12-19 03:43:08

Question: I use sklearn.tree.DecisionTreeClassifier to build a decision tree. With the optimal parameter settings, I get a tree that has unnecessary leaves (see the example picture below; I do not need probabilities, so the leaf nodes marked in red are an unnecessary split). Is there any third-party library for pruning these unnecessary nodes, or a code snippet? I could write one, but I can't really imagine that I am the first person with this problem... Code to replicate:

    from sklearn.tree import …
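scikit-learn has no built-in "merge same-class leaves" option, but a well-known snippet collapses an inner node whenever both children are leaves predicting the same class, by writing the leaf marker into the tree's children arrays. This touches the private `sklearn.tree._tree` module, so it is a sketch, not an official API:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree._tree import TREE_LEAF   # private constant, value -1

def prune_redundant(tree, node=0):
    """Turn `node` into a leaf when both children are leaves that predict
    the same class. Bottom-up, so whole redundant subtrees collapse."""
    left, right = tree.children_left[node], tree.children_right[node]
    if left == TREE_LEAF:
        return                              # already a leaf
    prune_redundant(tree, left)
    prune_redundant(tree, right)
    if (tree.children_left[left] == TREE_LEAF
            and tree.children_left[right] == TREE_LEAF
            and tree.value[left].argmax() == tree.value[right].argmax()):
        tree.children_left[node] = TREE_LEAF
        tree.children_right[node] = TREE_LEAF

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)
before = clf.predict(iris.data)
prune_redundant(clf.tree_)
after = clf.predict(iris.data)              # identical: only same-class
                                            # siblings were merged
```

Because only same-class sibling leaves are merged, predicted classes are unchanged; only the class probabilities differ, which the question explicitly does not need.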