decision-tree

Why does the CART algorithm of MATLAB's 'fitctree' take the attribute order into account?

这一生的挚爱 submitted on 2019-12-02 04:23:15
Question: Here is an example showing that MATLAB's fitctree takes the order of the features into account. Why?

load ionosphere % Contains X and Y variables
Mdl = fitctree(X,Y)
view(Mdl,'mode','graph');
X1 = fliplr(X);
Mdl1 = fitctree(X1,Y)
view(Mdl1,'mode','graph');

It is not the same model, and thus not the same classification accuracy, despite dealing with the same features?

Answer 1: In your example, X contains 34 predictors. The predictors have no names, and fitctree just refers to them by their column numbers x1 …
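As an illustration, here is a minimal scikit-learn sketch of the same experiment (my own addition; the dataset and settings are arbitrary choices, not from the original question). When two candidate splits give exactly the same impurity decrease, the tie is broken by the order in which columns are examined, so reversing the column order can produce a different, but equally valid, tree.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

# Any multi-feature dataset works; breast_cancer is an arbitrary stand-in.
X, y = load_breast_cancer(return_X_y=True)

# Fit once on the original columns and once on the column-reversed matrix.
clf_a = DecisionTreeClassifier(random_state=0).fit(X, y)
clf_b = DecisionTreeClassifier(random_state=0).fit(X[:, ::-1], y)

# The two trees may differ in size and in predictions wherever ties occurred.
print(clf_a.tree_.node_count, clf_b.tree_.node_count)
print((clf_a.predict(X) == clf_b.predict(X[:, ::-1])).mean())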

What is the purpose of using StringIO in DecisionTree?

萝らか妹 submitted on 2019-12-01 21:09:50
I am writing a decision tree and the following code is part of the complete code:

def show_tree(tree, features, path):
    f = io.StringIO()
    export_graphviz(tree, out_file=f, feature_names=features)
    pydotplus.graph_from_dot_data(f.getvalue()).write_png(path)
    img = misc.imread(path)
    plt.rcParams['figure.figsize'] = (20,20)
    plt.imshow(img)

Could anyone please tell me what the purpose of using StringIO is here?

Python is not my main language, but I think the answer to your question is quite simple and does not require a lot of research. StringIO is used here to maintain an in-memory input/output text stream.
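To make this concrete, here is a minimal sketch of two equivalent ways to feed the DOT source to pydotplus (my own illustration, assuming scikit-learn, pydotplus and the Graphviz binaries are installed). io.StringIO gives export_graphviz an in-memory, file-like object to write into, so nothing touches the disk; passing out_file=None instead makes export_graphviz return the DOT source as a string, which avoids the buffer entirely.

import io
import pydotplus
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
clf = DecisionTreeClassifier().fit(iris.data, iris.target)

# Variant 1: write the DOT source into an in-memory buffer (what StringIO is for).
buf = io.StringIO()
export_graphviz(clf, out_file=buf, feature_names=iris.feature_names)
pydotplus.graph_from_dot_data(buf.getvalue()).write_png("tree_via_stringio.png")

# Variant 2: skip the buffer; out_file=None returns the DOT string directly.
dot_data = export_graphviz(clf, out_file=None, feature_names=iris.feature_names)
pydotplus.graph_from_dot_data(dot_data).write_png("tree_direct.png")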

R Error: “In numerical expression has 19 elements: only the first used”

*爱你&永不变心* submitted on 2019-12-01 20:05:24
I created the following:

totalDeposit <- cumsum(testd$TermDepositAMT[s1$ix])

which calculates the cumulative sum of the TermDeposit amounts in the testd data frame and stores it in totalDeposit. This works perfectly. I then need to calculate the running average of the deposit amount, and I use the following:

avgDeposit <- totalDeposit / (1:testd)

but I get an error message:

Error in 1:testd : NA/NaN argument
In addition: Warning message:
In 1:testd : numerical expression has 19 elements: only the first used

testd has some 8000 observations and 19 variables. Could someone help me get past this issue?
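For context on what this code is trying to compute, here is a minimal numpy sketch of a running average (my own illustration with made-up amounts): the cumulative sum is divided element-wise by the sequence 1..n, so the divisor must have the same length as the cumulative sum rather than being the data frame itself (in R, something like seq_along(totalDeposit) rather than 1:testd).

import numpy as np

term_deposit_amt = np.array([100.0, 250.0, 50.0, 400.0])   # hypothetical deposit amounts

total_deposit = np.cumsum(term_deposit_amt)                 # running total: [100. 350. 400. 800.]
avg_deposit = total_deposit / np.arange(1, len(total_deposit) + 1)  # running mean

print(avg_deposit)   # [100. 175. 133.33333333 200.]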

Can sklearn DecisionTreeClassifier truly work with categorical data?

笑着哭i submitted on 2019-12-01 17:41:47
While working with the DecisionTreeClassifier I visualized it using graphviz, and I have to say, to my astonishment, it seems to take categorical data and use it as continuous data. All my features are categorical; for example, you can see the following tree (please note that the first feature, X[0], has 6 possible values: 0, 1, 2, 3, 4, 5). From what I found here, the class uses a tree class which is a binary tree, so it is a limitation in sklearn. Does anyone know a way that I am missing to use the tree categorically? (I know it is not better for the task, but as I need categories, currently I …
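A common workaround, sketched below as my own illustration with made-up data (not part of the original thread), is to one-hot encode the categorical columns before fitting: DecisionTreeClassifier only makes numeric "x <= threshold" splits, so unordered categories are usually expanded into indicator columns first.

import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0], [1], [2], [3], [4], [5], [0], [3]])   # one categorical feature with 6 levels
y = np.array([0, 0, 1, 1, 1, 0, 0, 1])                   # hypothetical labels

enc = OneHotEncoder(handle_unknown="ignore")
X_onehot = enc.fit_transform(X)           # one 0/1 indicator column per category (sparse)

clf = DecisionTreeClassifier().fit(X_onehot, y)           # trees accept sparse input
print(clf.predict(enc.transform([[2], [5]])))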

Getting decision path to a node in sklearn

巧了我就是萌 submitted on 2019-12-01 07:12:59
I wanted the decision path (i.e. the set of rules) from the root node to a given node (which I supply) in a decision tree (DecisionTreeClassifier) in scikit-learn. clf.decision_path specifies the nodes a sample goes through, which may help in getting the set of rules followed by the sample, but how do you get the set of rules up to a particular node in the tree?

For the decision rules of the nodes using the iris dataset:

from sklearn.datasets import load_iris
from sklearn import tree
import graphviz

iris = load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
…
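One way to recover the root-to-node rules directly is sketched below (my own illustration; path_to_node is a hypothetical helper written for this example, not a scikit-learn function, and it assumes the node id is the one shown by export_graphviz/plot_tree): walk the fitted tree's children_left/children_right/feature/threshold arrays from the root until the requested node is found.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

def path_to_node(tree, target, feature_names, node=0, rules=()):
    """Return the splitting rules from the root to node id `target`, or None."""
    if node == target:
        return list(rules)
    left, right = tree.children_left[node], tree.children_right[node]
    if left == -1:                                    # leaf reached: target not on this branch
        return None
    rule = f"{feature_names[tree.feature[node]]} <= {tree.threshold[node]:.3f}"
    found = path_to_node(tree, target, feature_names, left, rules + (rule,))
    if found is not None:
        return found
    rule = f"{feature_names[tree.feature[node]]} > {tree.threshold[node]:.3f}"
    return path_to_node(tree, target, feature_names, right, rules + (rule,))

# Rules leading to node id 4 (ids match those displayed by export_graphviz).
print(path_to_node(clf.tree_, 4, iris.feature_names))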

How to extract the splitting rules for the terminal nodes of ctree()

放肆的年华 submitted on 2019-12-01 01:16:26
I have a data set with 6 categorical variables with levels ranging from 5 to 28. I have obtained an output from ctree() (party package) with 17 terminal nodes. I followed the inputs by @Galled from "ctree() - How to get the list of splitting conditions for each terminal node?" to arrive at my desired output, but I am getting the following error after running the code:

Error in data.frame(ResulTable, Means, Counts) :
  arguments imply differing number of rows: 17, 2

I have tried adding these extra lines:

ResulTable <- rbind(ResulTable, cbind(Node = Node, Path = Path2))
ResulTable$Node <- …

What does the value of 'leaf' in the following xgboost model tree diagram mean?

≡放荡痞女 submitted on 2019-11-30 18:37:12
I am guessing that it is the conditional probability given that the above (tree branch) condition holds. However, I am not clear on it. If you want to read more about the data used or how the diagram was obtained, go to: http://machinelearningmastery.com/visualize-gradient-boosting-decision-trees-xgboost-python/

The leaf attribute is the predicted value. In other words, if the evaluation of a tree model ends at that terminal node (aka leaf node), then this is the value that is returned. In pseudocode (the left-most branch of your tree model):

if(f1 < 127.5){
  if(f7 < 28.5){
    if(f5 < 45.4){
      return 0 …
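To make the role of the leaf values concrete, here is a minimal sketch (my own addition, with an arbitrary dataset and settings): for the binary:logistic objective, the leaf values of the individual boosted trees are margin contributions that are summed into a raw score, and the reported probability is the sigmoid of that sum, which output_margin=True lets you verify.

import numpy as np
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
dtrain = xgb.DMatrix(X, label=y)

bst = xgb.train({"objective": "binary:logistic", "max_depth": 3},
                dtrain, num_boost_round=5)

margin = bst.predict(dtrain, output_margin=True)   # sum of the trees' leaf values (+ base score)
prob = bst.predict(dtrain)                         # the usual probability output

# The probability is just the sigmoid of the summed leaf values.
print(np.allclose(prob, 1.0 / (1.0 + np.exp(-margin))))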

Dictionary object to decision tree in Pydot

独自空忆成欢 submitted on 2019-11-30 15:18:24
I have a dictionary object such as:

menu = {'dinner':{'chicken':'good','beef':'average','vegetarian':{'tofu':'good','salad':{'caeser':'bad','italian':'average'}},'pork':'bad'}}

I'm trying to create a graph (decision tree) using pydot with the 'menu' data, as in the linked example. 'Dinner' would be the top node and its values (chicken, beef, etc.) are below it. Referring to the link, the graph function takes two parameters: a source and a node. It would look something like that example, except 'king' would be 'dinner' and 'lord' would be 'chicken', 'beef', etc. My question is: how do I access a key in a value? To create …
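Here is a minimal sketch of one way to do this (my own illustration, assuming pydot and the Graphviz binaries are installed): recurse over the nested dictionary, adding an edge from each key to each of its children; plain strings such as 'good' or 'bad' become terminal nodes. Note that repeated leaf labels ('good', 'bad', 'average') would share a single node unless they are given unique names.

import pydot

menu = {'dinner': {'chicken': 'good',
                   'beef': 'average',
                   'vegetarian': {'tofu': 'good',
                                  'salad': {'caeser': 'bad', 'italian': 'average'}},
                   'pork': 'bad'}}

def add_edges(graph, parent, subtree):
    """Recursively add parent -> child edges for a nested dict whose leaves are strings."""
    if isinstance(subtree, dict):
        for child, grandchildren in subtree.items():
            graph.add_edge(pydot.Edge(parent, child))
            add_edges(graph, child, grandchildren)
    else:
        graph.add_edge(pydot.Edge(parent, subtree))   # leaf value such as 'good'

graph = pydot.Dot(graph_type='graph')
root = next(iter(menu))                               # 'dinner'
add_edges(graph, root, menu[root])
graph.write_png('menu_tree.png')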