How to get percentages from decision tree for each node

Submitted by 倖福魔咒の on 2019-12-07 19:17:45

Question


How could I create a table that includes the percentages for each node in the plot below?

library(rpart)
library(rattle)
library(rpart.plot)
library(RColorBrewer)

fit <- rpart(Species ~ ., data=iris, method="class")
fancyRpartPlot(fit)

It results in this plot:

I would like to output a table with species as the first column and the associated percent at each node in a second column. A second iteration of the table would exclude the first node (100%) and also remove duplicates by retaining the row that contains a higher percentage.

After picking through the "rpart" documentation I'm still unable to figure out how to create this table. Please let me know what you think.

Thank you for your time.


Answer 1:


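The where element of the rpart object is not the predicted class: for each observation it gives the row of fit$frame belonging to the terminal node (leaf) that the observation falls into. You can cross-tabulate it against the species with: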

> iris$where <- fit$where
> with(iris, table(Species, where))
            where
Species       2  4  5
  setosa     50  0  0
  versicolor  0 49  1
  virginica   0  5 45
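
Note that the column labels 2, 4 and 5 are row indices into fit$frame, not the node numbers printed in the plot. A minimal sketch of the conversion, assuming the same model as in the question (the name node_id is only illustrative, and datasets::iris is used so that a where column added to a local copy of iris does not end up as a predictor):

```r
library(rpart)

# Refit on the untouched dataset so the example is self-contained.
fit <- rpart(Species ~ ., data = datasets::iris, method = "class")

# Row names of fit$frame are the node numbers; fit$where indexes those rows.
node_id <- as.integer(rownames(fit$frame))[fit$where]

# Same cross-tabulation as above, but labelled by node number.
table(Species = datasets::iris$Species, node = node_id)
```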

I'm guessing you want the column sums divided by the total count, i.e. the percentage of all observations that end up in each terminal node?

> 100*colSums(with(iris, table(Species, where)) )/150
       2        4        5 
33.33333 36.00000 30.66667 
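
The percentages above cover only the terminal nodes. If the table should list every node in the plot, one possible approach is to read the counts directly from fit$frame. The following is a sketch under that assumption, not necessarily what the original answer intended; node_table, no_root and best are illustrative names:

```r
library(rpart)

# Refit on the untouched dataset so the example is self-contained.
fit <- rpart(Species ~ ., data = datasets::iris, method = "class")
frame <- fit$frame

# One row per node: node number, fitted class, and share of all observations.
node_table <- data.frame(
  node    = as.integer(rownames(frame)),
  species = attr(fit, "ylevels")[frame$yval],  # class label fitted at the node
  percent = round(100 * frame$n / frame$n[1]), # n at node / n at root
  leaf    = frame$var == "<leaf>"
)
print(node_table)

# Second iteration: drop the root node, then keep the row with the
# highest percentage for each species.
no_root <- node_table[node_table$node != 1, ]
no_root <- no_root[order(no_root$species, -no_root$percent), ]
best    <- no_root[!duplicated(no_root$species), c("species", "percent")]
print(best)
```

Here frame$n is the number of observations reaching each node, so dividing by the root count gives the share of the data at that node.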


Source: https://stackoverflow.com/questions/27727149/how-to-get-percentages-from-decision-tree-for-each-node
