Converting an uneven hierarchical list to a data frame

五迷三道 submitted on 2019-12-06 02:42:24

I don't know about elegant, but this works. Those more familiar with plyr could probably provide a more general solution.

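library(plyr)  # provides ldply()

# Stack every "pitch" element of a half-inning's at-bat into one data frame
cleanFun <- function(x) {
   atbat <- x[["atbat"]]
   pitches <- do.call(rbind, atbat[names(atbat) == "pitch"])
   as.data.frame(pitches)
}
ldply(xml.list[c("top","bottom")], cleanFun)[,1:5]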
     .id             des  id type      x
1    top            Ball 310    B  70.39
2    top   Called Strike 311    S 118.45
3    top   Called Strike 312    S  86.70
4    top In play, out(s) 313    X  79.83
5 bottom            Ball 335    B  15.45
6 bottom   Called Strike 336    S  77.25
7 bottom Swinging Strike 337    S  99.57
8 bottom            Ball 338    B 106.44
9 bottom In play, out(s) 339    X 134.76
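
Since xml.list itself is not reproduced here, the following is a minimal sketch of the same idea on a made-up list. The object toy and its values are assumptions that only imitate the shape the code above relies on (half-innings holding an atbat whose pitch elements are named character vectors); the real list may well differ.

```r
library(plyr)

# Hypothetical stand-in for xml.list: two half-innings, one at-bat each
toy <- list(
  top = list(atbat = list(
    pitch  = c(des = "Ball", id = "310", type = "B"),
    pitch  = c(des = "Called Strike", id = "311", type = "S"),
    .attrs = c(des = "Lines out")
  )),
  bottom = list(atbat = list(
    pitch  = c(des = "Ball", id = "335", type = "B"),
    .attrs = c(des = "Grounds out")
  ))
)

# Keep only the "pitch" elements and bind them row-wise
clean_fun <- function(x) {
  atbat <- x[["atbat"]]
  as.data.frame(do.call(rbind, atbat[names(atbat) == "pitch"]))
}

# ldply() adds the list names ("top", "bottom") as the .id column
res <- ldply(toy, clean_fun)
print(res)
```

The result should hold one row per pitch, with .id carrying the name of the half-inning each row came from. Note that do.call(rbind, ...) only works cleanly here because every pitch vector has the same names.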

The .id column that ldply() adds is handy, but the .id columns seem to collide once you nest one ldply() inside another.

Here is a fairly general function that uses rbind.fill():

aho <- ldply(llply(xml.list[[1]], function(x) ldply(x, function(x) rbind.fill(data.frame(t(x))))))
> aho[1:5,1:4]
     .id                                                       des   id type
1  pitch                                                      Ball  310    B
2  pitch                                             Called Strike  311    S
3  pitch                                             Called Strike  312    S
4  pitch                                           In play, out(s)  313    X
5 .attrs Alexei Ramirez lines out to second baseman Ian Kinsler.   <NA> <NA>
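
The <NA> entries in the .attrs row come from the way rbind.fill() pads missing columns. A small sketch of that behaviour in isolation, using two made-up one-row data frames (pitch_row and attrs_row are hypothetical names, not objects from the original data):

```r
library(plyr)

# Two rows with different sets of columns, like a pitch and an .attrs element
pitch_row <- data.frame(des = "Ball", id = "310", type = "B",
                        stringsAsFactors = FALSE)
attrs_row <- data.frame(des = "Lines out", stringsAsFactors = FALSE)

# rbind.fill() keeps the union of the columns and fills the gaps with NA
combined <- rbind.fill(pitch_row, attrs_row)
print(combined)
```

This is what lets the uneven elements of the list end up in a single data frame, where plain rbind() would complain about mismatched columns.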

The .id from the outer ldply() is missing because the inner result already has a column named .id. We can work around this by renaming the inner .id column, although that does not feel very clean.

aho2 <- ldply(llply(xml.list[[1]], function(x) {
  out <- ldply(x, function(x) rbind.fill(data.frame(t(x))))
  names(out)[1] <- ".id2"
  out
}))
> aho2[1:5,1:4]
    .id   .id2                                                       des   id
1 atbat  pitch                                                      Ball  310
2 atbat  pitch                                             Called Strike  311
3 atbat  pitch                                             Called Strike  312
4 atbat  pitch                                           In play, out(s)  313
5 atbat .attrs Alexei Ramirez lines out to second baseman Ian Kinsler.   <NA>
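
As a possible alternative to renaming the column by hand, here is a sketch that names each index column through the .id argument of ldply(). It assumes an installed plyr release whose ldply() accepts .id (newer releases document it), and it runs on a made-up list: toy_half and the column names "node" and "element" are hypothetical.

```r
library(plyr)

# Hypothetical stand-in for xml.list[[1]]: one at-bat with uneven elements
toy_half <- list(
  atbat = list(
    pitch  = c(des = "Ball", id = "310", type = "B"),
    pitch  = c(des = "Called Strike", id = "311", type = "S"),
    .attrs = c(des = "Lines out")
  )
)

# The inner ldply() labels rows by element name, the outer one by node name;
# giving each index column its own name keeps them from colliding
nested <- ldply(toy_half, function(ab) {
  ldply(ab, function(v) data.frame(t(v), stringsAsFactors = FALSE),
        .id = "element")
}, .id = "node")
print(nested)
```

Because the two index columns have different names, neither one is dropped when the results are combined.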