Ctree classification with weights - results displayed

Submitted by 懵懂的女人 on 2020-01-15 09:09:55

Question


Let's say I want to use the iris data example, but correctly classifying versicolor is 5 times more important to me.

library(party)
data(iris)
irisct <- ctree(Species ~ ., data = iris, weights = ifelse(iris$Species == "versicolor", 5, 1))
plot(irisct)

The plotted tree then reports weighted node sizes and conditional probabilities in each node, because every versicolor observation is counted 5 times. Is there a way to "disable" this, i.e. show the original number of observations (total = 150 for iris)?

Many thanks for your help!


Answer 1:


The enhanced reimplementation of ctree() in the partykit package also has somewhat more flexible plotting capabilities. Specifically, the node_barplot() panel function gained a mainlab argument that can be used to customize the main labels. For example, for the iris data:

library("partykit")
ct <- ctree(Species ~ ., data = iris)

You can set up a vector of labels and then supply a function that accesses these:

lab <- paste("Foo", 1:7)
ml <- function(id, nobs) lab[as.numeric(id)]
plot(ct, tp_args = list(mainlab = ml))

Of course, the example above is not very meaningful but could be modified to accomplish what you want with a little bit of coding.
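As one possible way to adapt this to the weighted tree from the question, the sketch below looks up the unweighted number of training observations in each terminal node and prints that in the label instead of the weighted nobs. The names w, ctw, node_id and n_orig are made up for illustration, and the sketch assumes that mainlab is called with the node id as its first argument, as in the example above. Note that it only changes the label; the bars themselves still show the weighted class proportions.

```r
library("partykit")

# Case weights from the question: versicolor counts 5 times
w <- ifelse(iris$Species == "versicolor", 5, 1)
ctw <- ctree(Species ~ ., data = iris, weights = w)

# Terminal node id of every original (unweighted) observation
node_id <- predict(ctw, newdata = iris, type = "node")

# Unweighted number of observations per terminal node
n_orig <- table(node_id)

# Label function: ignore the weighted nobs and look up the
# unweighted count by node id instead
ml <- function(id, nobs) {
  sprintf("Node %s (n = %s)", id, n_orig[[as.character(id)]])
}

plot(ctw, tp_args = list(mainlab = ml))
```

The counts in n_orig add up to the 150 rows of iris, regardless of the weights used for fitting.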

However, be careful about upweighting certain observations via the weights argument. The ctree() function really treats the weights as case weights, and consequently the significance tests used for splitting change. With a larger effective number of observations, all p-values become smaller and hence the tree selects more splits (unless mincriterion is increased at the same time). Compare the ct tree above, which has 4 terminal nodes, with

ct2 <- ctree(Species ~ ., data = iris, weights = rep(2, 150))
ct3 <- ctree(Species ~ ., data = iris, weights = rep(2, 150), mincriterion = 0.999)

The resulting numbers of terminal nodes are

c(width(ct), width(ct2), width(ct3))
[1] 4 6 4
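The same comparison can be run for the weighting scheme from the question. This is only a sketch: the object names are illustrative, the value 0.999 for mincriterion is simply carried over from the example above rather than a recommended setting, and no output is shown because the resulting tree sizes were not part of the original answer.

```r
library("partykit")

# Case weights from the question: versicolor counts 5 times
w <- ifelse(iris$Species == "versicolor", 5, 1)

# Unweighted tree, weighted tree, and weighted tree with a
# stricter splitting criterion
ct_plain  <- ctree(Species ~ ., data = iris)
ctw       <- ctree(Species ~ ., data = iris, weights = w)
ctw_strict <- ctree(Species ~ ., data = iris, weights = w,
                    mincriterion = 0.999)

# Number of terminal nodes of each tree
c(width(ct_plain), width(ctw), width(ctw_strict))
```

If the weighted tree turns out larger than the unweighted one, raising mincriterion is one way to counteract the extra splits.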


Source: https://stackoverflow.com/questions/27260838/ctree-classification-with-weights-results-displayed
