decision-tree

Why does the CART algorithm of MATLAB's 'fitctree' take the attribute order into account?

这一生的挚爱 submitted on 2019-12-02 04:23:15
Question: Here is an example showing that MATLAB's fitctree takes the order of the features into account. Why?

load ionosphere % Contains X and Y variables
Mdl = fitctree(X,Y)
view(Mdl,'mode','graph');
X1 = fliplr(X);
Mdl1 = fitctree(X1,Y)
view(Mdl1,'mode','graph');

It is not the same model, and thus not the same classification accuracy, despite dealing with the same features?

Answer 1: In your example, X contains 34 predictors. The predictors have no names, and fitctree just refers to them by their column numbers x1 …
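As an illustration, here is a minimal scikit-learn sketch of the same experiment (my own addition; the dataset and settings are arbitrary choices, not from the original question). When two candidate splits give exactly the same impurity decrease, the tie is broken by the order in which columns are examined, so reversing the column order can produce a different, but equally valid, tree.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

# Any multi-feature dataset works; breast_cancer is an arbitrary stand-in.
X, y = load_breast_cancer(return_X_y=True)

# Fit once on the original columns and once on the column-reversed matrix.
clf_a = DecisionTreeClassifier(random_state=0).fit(X, y)
clf_b = DecisionTreeClassifier(random_state=0).fit(X[:, ::-1], y)

# The two trees may differ in size and in predictions wherever ties occurred.
print(clf_a.tree_.node_count, clf_b.tree_.node_count)
print((clf_a.predict(X) == clf_b.predict(X[:, ::-1])).mean())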

What is the purpose of using StringIO in DecisionTree?

萝らか妹 submitted on 2019-12-01 21:09:50
I am writing a decision tree and the following code is part of the complete code:

def show_tree(tree, features, path):
    f = io.StringIO()
    export_graphviz(tree, out_file=f, feature_names=features)
    pydotplus.graph_from_dot_data(f.getvalue()).write_png(path)
    img = misc.imread(path)
    plt.rcParams['figure.figsize'] = (20,20)
    plt.imshow(img)

Could anyone please tell me what the purpose of using StringIO is here?

Python is not my main language, but I think the answer to your question is quite simple and does not require a lot of research. StringIO is used here to maintain an in-memory input/output text stream.
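To make this concrete, here is a minimal sketch of two equivalent ways to feed the DOT source to pydotplus (my own illustration, assuming scikit-learn, pydotplus and the Graphviz binaries are installed). io.StringIO gives export_graphviz an in-memory, file-like object to write into, so nothing touches the disk; passing out_file=None instead makes export_graphviz return the DOT source as a string, which avoids the buffer entirely.

import io
import pydotplus
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
clf = DecisionTreeClassifier().fit(iris.data, iris.target)

# Variant 1: write the DOT source into an in-memory buffer (what StringIO is for).
buf = io.StringIO()
export_graphviz(clf, out_file=buf, feature_names=iris.feature_names)
pydotplus.graph_from_dot_data(buf.getvalue()).write_png("tree_via_stringio.png")

# Variant 2: skip the buffer; out_file=None returns the DOT string directly.
dot_data = export_graphviz(clf, out_file=None, feature_names=iris.feature_names)
pydotplus.graph_from_dot_data(dot_data).write_png("tree_direct.png")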

R Error: “In numerical expression has 19 elements: only the first used”

*爱你&永不变心* submitted on 2019-12-01 20:05:24
I created the following:

totalDeposit <- cumsum(testd$TermDepositAMT[s1$ix])

which calculates the cumulative sum of the TermDeposit amounts in the testd data frame and stores it in totalDeposit. This works perfectly. I then need to calculate the running average of the deposit amount, and I use the following:

avgDeposit <- totalDeposit / (1:testd)

but I get an error message:

Error in 1:testd : NA/NaN argument
In addition: Warning message:
In 1:testd : numerical expression has 19 elements: only the first used

testd has some 8000 observations and 19 variables. Could someone help me get past this issue?
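For context on what this code is trying to compute, here is a minimal numpy sketch of a running average (my own illustration with made-up amounts): the cumulative sum is divided element-wise by the sequence 1..n, so the divisor must have the same length as the cumulative sum rather than being the data frame itself (in R, something like seq_along(totalDeposit) rather than 1:testd).

import numpy as np

term_deposit_amt = np.array([100.0, 250.0, 50.0, 400.0])   # hypothetical deposit amounts

total_deposit = np.cumsum(term_deposit_amt)                 # running total: [100. 350. 400. 800.]
avg_deposit = total_deposit / np.arange(1, len(total_deposit) + 1)  # running mean

print(avg_deposit)   # [100. 175. 133.33333333 200.]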

Can sklearn DecisionTreeClassifier truly work with categorical data?

笑着哭i submitted on 2019-12-01 17:41:47
While working with the DecisionTreeClassifier I visualized it using graphviz, and I have to say, to my astonishment, it seems to take categorical data and use it as continuous data. All my features are categorical; for example, you can see the following tree (please note that the first feature, X[0], has 6 possible values: 0, 1, 2, 3, 4, 5). From what I found here, the class uses a tree class which is a binary tree, so it is a limitation in sklearn. Does anyone know a way that I am missing to use the tree categorically? (I know it is not better for the task, but as I need categories, currently I …
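A common workaround, sketched below as my own illustration with made-up data (not part of the original thread), is to one-hot encode the categorical columns before fitting: DecisionTreeClassifier only makes numeric "x <= threshold" splits, so unordered categories are usually expanded into indicator columns first.

import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0], [1], [2], [3], [4], [5], [0], [3]])   # one categorical feature with 6 levels
y = np.array([0, 0, 1, 1, 1, 0, 0, 1])                   # hypothetical labels

enc = OneHotEncoder(handle_unknown="ignore")
X_onehot = enc.fit_transform(X)           # one 0/1 indicator column per category (sparse)

clf = DecisionTreeClassifier().fit(X_onehot, y)           # trees accept sparse input
print(clf.predict(enc.transform([[2], [5]])))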

Getting decision path to a node in sklearn

巧了我就是萌 submitted on 2019-12-01 07:12:59
I wanted the decision path (i.e. the set of rules) from the root node to a given node (which I supply) in a decision tree (DecisionTreeClassifier) in scikit-learn. clf.decision_path specifies the nodes a sample goes through, which may help in getting the set of rules followed by the sample, but how do you get the set of rules up to a particular node in the tree?

For the decision rules of the nodes using the iris dataset:

from sklearn.datasets import load_iris
from sklearn import tree
import graphviz

iris = load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
…
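One way to recover the root-to-node rules directly is sketched below (my own illustration; path_to_node is a hypothetical helper written for this example, not a scikit-learn function, and it assumes the node id is the one shown by export_graphviz/plot_tree): walk the fitted tree's children_left/children_right/feature/threshold arrays from the root until the requested node is found.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

def path_to_node(tree, target, feature_names, node=0, rules=()):
    """Return the splitting rules from the root to node id `target`, or None."""
    if node == target:
        return list(rules)
    left, right = tree.children_left[node], tree.children_right[node]
    if left == -1:                                    # leaf reached: target not on this branch
        return None
    rule = f"{feature_names[tree.feature[node]]} <= {tree.threshold[node]:.3f}"
    found = path_to_node(tree, target, feature_names, left, rules + (rule,))
    if found is not None:
        return found
    rule = f"{feature_names[tree.feature[node]]} > {tree.threshold[node]:.3f}"
    return path_to_node(tree, target, feature_names, right, rules + (rule,))

# Rules leading to node id 4 (ids match those displayed by export_graphviz).
print(path_to_node(clf.tree_, 4, iris.feature_names))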

How to extract the splitting rules for the terminal nodes of ctree()

放肆的年华 submitted on 2019-12-01 01:16:26
I have a data set with 6 categorical variables with levels ranging from 5 to 28. I have obtained an output from ctree() (party package) with 17 terminal nodes. I followed the inputs by @Galled from "ctree() - How to get the list of splitting conditions for each terminal node?" to arrive at my desired output, but I am getting the following error after running the code:

Error in data.frame(ResulTable, Means, Counts) :
  arguments imply differing number of rows: 17, 2

I have tried adding these extra lines:

ResulTable <- rbind(ResulTable, cbind(Node = Node, Path = Path2))
ResulTable$Node <- …

What does the value of 'leaf' in the following xgboost model tree diagram mean?

≡放荡痞女 submitted on 2019-11-30 18:37:12
I am guessing that it is the conditional probability given that the above (tree branch) condition holds. However, I am not clear on it. If you want to read more about the data used or how the diagram was obtained, go to: http://machinelearningmastery.com/visualize-gradient-boosting-decision-trees-xgboost-python/

The leaf attribute is the predicted value. In other words, if the evaluation of a tree model ends at that terminal node (aka leaf node), then this is the value that is returned. In pseudocode (the left-most branch of your tree model):

if(f1 < 127.5){
  if(f7 < 28.5){
    if(f5 < 45.4){
      return 0 …
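To make the role of the leaf values concrete, here is a minimal sketch (my own addition, with an arbitrary dataset and settings): for the binary:logistic objective, the leaf values of the individual boosted trees are margin contributions that are summed into a raw score, and the reported probability is the sigmoid of that sum, which output_margin=True lets you verify.

import numpy as np
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
dtrain = xgb.DMatrix(X, label=y)

bst = xgb.train({"objective": "binary:logistic", "max_depth": 3},
                dtrain, num_boost_round=5)

margin = bst.predict(dtrain, output_margin=True)   # sum of the trees' leaf values (+ base score)
prob = bst.predict(dtrain)                         # the usual probability output

# The probability is just the sigmoid of the summed leaf values.
print(np.allclose(prob, 1.0 / (1.0 + np.exp(-margin))))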

Dictionary object to decision tree in Pydot

独自空忆成欢 submitted on 2019-11-30 15:18:24
I have a dictionary object such as:

menu = {'dinner':{'chicken':'good','beef':'average','vegetarian':{'tofu':'good','salad':{'caeser':'bad','italian':'average'}},'pork':'bad'}}

I'm trying to create a graph (decision tree) using pydot with the 'menu' data, as in the linked example. 'Dinner' would be the top node and its values (chicken, beef, etc.) are below it. Referring to the link, the graph function takes two parameters: a source and a node. It would look something like that example, except 'king' would be 'dinner' and 'lord' would be 'chicken', 'beef', etc. My question is: how do I access a key in a value? To create …
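Here is a minimal sketch of one way to do this (my own illustration, assuming pydot and the Graphviz binaries are installed): recurse over the nested dictionary, adding an edge from each key to each of its children; plain strings such as 'good' or 'bad' become terminal nodes. Note that repeated leaf labels ('good', 'bad', 'average') would share a single node unless they are given unique names.

import pydot

menu = {'dinner': {'chicken': 'good',
                   'beef': 'average',
                   'vegetarian': {'tofu': 'good',
                                  'salad': {'caeser': 'bad', 'italian': 'average'}},
                   'pork': 'bad'}}

def add_edges(graph, parent, subtree):
    """Recursively add parent -> child edges for a nested dict whose leaves are strings."""
    if isinstance(subtree, dict):
        for child, grandchildren in subtree.items():
            graph.add_edge(pydot.Edge(parent, child))
            add_edges(graph, child, grandchildren)
    else:
        graph.add_edge(pydot.Edge(parent, subtree))   # leaf value such as 'good'

graph = pydot.Dot(graph_type='graph')
root = next(iter(menu))                               # 'dinner'
add_edges(graph, root, menu[root])
graph.write_png('menu_tree.png')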