Displaying inference tree node values with “print”

Posted by 狂风中的少年 on 2019-12-24 05:08:05

Question


I apologize in advance if I butcher this question as I'm very new to R and statistical analysis in general.

I've generated a conditional inference tree using the party library.
When I plot(my_tree, type = "simple") I get a result like this:

When I print(my_tree) I get a result like this:

1) SOME_VALUE <= 2.5; criterion = 1, statistic = 1306.478
  2) SOME_VALUE <= -10.5; criterion = 1, statistic = 173.416
    3) SOME_VALUE <= -16; criterion = 1, statistic = 19.385
      4)*  weights = 275 
    3) SOME_VALUE > -16
      5)*  weights = 261 
  2) SOME_VALUE > -10.5
    6) SOME_VALUE <= -2.5; criterion = 1, statistic = 24.094
      7) SOME_VALUE <= -6.5; criterion = 0.974, statistic = 4.989
        8)*  weights = 346 
      7) SOME_VALUE > -6.5
        9)*  weights = 563 
    6) SOME_VALUE > -2.5
      10)*  weights = 442 
1) SOME_VALUE > 2.5
  11) SOME_VALUE <= 10; criterion = 1, statistic = 225.148
    12) SOME_VALUE <= 6.5; criterion = 1, statistic = 18.789
      13)*  weights = 648 
    12) SOME_VALUE > 6.5
      14)*  weights = 473 
  11) SOME_VALUE > 10
    15) SOME_VALUE <= 16; criterion = 1, statistic = 51.729
      16)*  weights = 595 
    15) SOME_VALUE > 16
      17) SOME_VALUE <= 23.5; criterion = 0.997, statistic = 8.931
        18)*  weights = 488 
      17) SOME_VALUE > 23.5
        19)*  weights = 365 

I prefer the output of print, but it lacks the y = (0.96, 0.04) values that the plot shows for the terminal nodes.

Ideally, I would like my output to look something like this:

1) SOME_VALUE <= 2.5; criterion = 1, statistic = 1306.478
  2) SOME_VALUE <= -10.5; criterion = 1, statistic = 173.416
    3) SOME_VALUE <= -16; criterion = 1, statistic = 19.385
      4)*  weights = 275; y = (0.96, 0.04)
    3) SOME_VALUE > -16
      5)*  weights = 261; y = (0.831, 0.169)
  2) SOME_VALUE > -10.5
...

How do I go about accomplishing this?


Answer 1:


It is possible to do this with the partykit package (the successor to party), but even there it requires some hacking. In principle, the print() function can be customized with panel functions for the inner and terminal nodes. However, these panel functions are not very elegant, even for seemingly simple tasks like this one.

As you appear to have used a tree with a bivariate response, let's consider this simple (albeit not very meaningful) reproducible example:

library("partykit")
airq <- subset(airquality, !is.na(Ozone))
ct <- ctree(Ozone + Wind ~ ., data = airq)

For the inner nodes let's assume we just want to show the p-value that is readily available in the $info of each node. We can format this via:

ip <- function(node) formatinfo_node(node,
  prefix = " ",
  FUN = function(info) paste0("[p = ", format.pval(info$p.value), "]")
)

For the terminal nodes we want to show the number of observations (assuming no weights have been used) and the mean response. Both are pre-computed in small tables and then accessed via the $id of each node:

n <- table(ct$fitted[["(fitted)"]])
m <- aggregate(ct$fitted[["(response)"]], list(ct$fitted[["(fitted)"]]), mean)
m <- apply(m[, -1], 1, function(x) paste(round(x, digits = 3), collapse = ", "))
names(m) <- names(n)
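
To see what these two lookup tables contain, here is a small base-R sketch on toy data. The vectors fitted_id and response are hypothetical stand-ins for ct$fitted[["(fitted)"]] and ct$fitted[["(response)"]]; the numbers are made up for illustration and are not taken from the airquality tree.

```r
# Hypothetical stand-ins for ct$fitted[["(fitted)"]] and ct$fitted[["(response)"]]
fitted_id <- c(3, 3, 4, 4, 5)
response <- data.frame(
  Ozone = c(10, 20, 40, 44, 80),
  Wind  = c(12, 10, 9, 11, 7)
)

# Observations per terminal node, named by node id ("3", "4", "5")
n <- table(fitted_id)

# Column means per terminal node, collapsed into one string per node
m <- aggregate(response, list(fitted_id), mean)
m <- apply(m[, -1], 1, function(x) paste(round(x, digits = 3), collapse = ", "))
names(m) <- names(n)

# Lookup by character id, which is how a panel function would access them
n[["4"]]
m[["4"]]
```

Both objects are named by the node id as a character string, which is why the lookup uses as.character() on the id rather than the numeric id itself: indexing with the number 4 would pick the fourth element, not the element for node 4.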

The panel function is then defined by:

tp <- function(node) formatinfo_node(node,
  prefix = ": ",
  FUN = function(info) paste0(
    "n = ", n[as.character(node$id)],
    ", y = (", m[as.character(node$id)], ")"
  )
)

To apply this in the print() method we need to call print.party() directly, because print.constparty() currently does not pass the panel arguments on correctly. (We will have to fix this in the partykit package.)

print.party(ct, inner_panel = ip, terminal_panel = tp)
## [1] root
## |   [2] Temp <= 82 [p = 0.0044842]
## |   |   [3] Temp <= 77: n = 52, y = (18.615, 11.562)
## |   |   [4] Temp > 77: n = 27, y = (41.815, 9.737)
## |   [5] Temp > 82: n = 37, y = (75.405, 7.565)

This is hopefully close to what you wanted to do and should give you a template for further modifications.
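
The y = (0.96, 0.04) values in the question look like class proportions for a two-level factor response rather than means of a bivariate response. That is an assumption, since the question does not state the response type. Under that assumption, the lookup table for the terminal nodes could be built with prop.table() instead of aggregate(). The sketch below uses made-up toy vectors (node_id, resp) in place of ct$fitted[["(fitted)"]] and ct$fitted[["(response)"]], and the names n2, p2 and tp2 are illustrative only.

```r
# Hypothetical stand-ins for the fitted node ids and a two-level factor response
node_id <- c(4, 4, 4, 4, 5, 5)
resp <- factor(c("no", "no", "no", "yes", "yes", "no"), levels = c("no", "yes"))

# Observations per terminal node
n2 <- table(node_id)

# Row-wise class proportions per terminal node, collapsed into one string each
p2 <- prop.table(table(node_id, resp), margin = 1)
p2 <- apply(p2, 1, function(x) paste(round(x, digits = 3), collapse = ", "))

# Terminal panel in the same style as tp above; it needs partykit's
# formatinfo_node() only when it is actually called during printing
tp2 <- function(node) formatinfo_node(node,
  prefix = ": ",
  FUN = function(info) paste0(
    "n = ", n2[as.character(node$id)],
    ", y = (", p2[as.character(node$id)], ")"
  )
)

p2
```

With a real tree, n2 and p2 would be computed from the tree's own $fitted data, and tp2 would be passed as terminal_panel in the same print.party() call as before.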



Source: https://stackoverflow.com/questions/33356122/displaying-inference-tree-node-values-with-print
