Converting data frame into deeply nested list

Submitted by 六月ゝ 毕业季﹏ on 2019-12-23 09:59:17

Question


I'm trying to create a data structure that the whisker package expects, and I can't seem to figure out how to create that structure from my data frame. Let's say I have the following data frame:

library(dplyr)  

existing_format <- 
  mtcars %>% 
    select(carb, gear, cyl) %>% 
    arrange(carb, gear, cyl) %>% 
    distinct() 

...I would like to go from existing_format to the following desired format (only first two elements of desired_format list are shown):

desired_format <- list(
  list( 
    carb = "1",
    gear = list(
      list(gear = "3", cyl = list(list(cyl = "4"), list(cyl = "6"))),
      list(gear = "4", cyl = list(list(cyl = "4")))
    )
  ),
  list( 
    carb = "2",
    gear = list(
      list(gear = "3", cyl = list(list(cyl = "8"))),
      list(gear = "4", cyl = list(list(cyl = "4"))),
      list(gear = "5", cyl = list(list(cyl = "4")))
    )
  )
)

I've tried things like grouping by carb and gear and then using tidyr::nest() to create a nested data frame, but nothing has worked. Something tells me that whisker::iteratelist() or whisker::rowSplit() is the way forward, but I can't figure it out.
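For context, here is a minimal sketch of how a list of this shape might be consumed by a Mustache template with nested sections. It assumes the whisker package is installed. The template text and the top-level key `carbs` are hypothetical and are not part of the original question. Only the first element of desired_format is used.

```r
library(whisker)

# First element of desired_format, wrapped under a hypothetical key "carbs"
data <- list(carbs = list(
  list(
    carb = "1",
    gear = list(
      list(gear = "3", cyl = list(list(cyl = "4"), list(cyl = "6"))),
      list(gear = "4", cyl = list(list(cyl = "4")))
    )
  )
))

# Hypothetical template: each {{#...}} section iterates over an unnamed list,
# and the tag of the same name inside it picks up the value of the current item
template <- "{{#carbs}}carb {{carb}}: {{#gear}}[gear {{gear}}: {{#cyl}}cyl {{cyl}} {{/cyl}}] {{/gear}}{{/carbs}}"

out <- whisker.render(template, data)
cat(out, "\n")
```

The unnamed inner lists are what make each section repeat, which is why every level of the desired format is a list of lists rather than a named list.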

Thanks, Chris


Answer 1:


This is perhaps more flexible than it needs to be in this case, but you can do a recursive split:

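rsplit <- function(dd) {
  col <- names(dd)[1]   # name of the column to split on at this level
  dat <- dd[[1]]        # values of that column
  # one list element per distinct value, in order of first appearance
  lapply(unique(dat), function(x) {
    z <- setNames(list(x), col)
    if (ncol(dd) > 1) {
      # recurse on the matching rows of the remaining columns and
      # store the result under the next column's name
      z[[names(dd)[2]]] <- rsplit(dd[dat == x, -1, drop = FALSE])
    }
    z
  })
}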

rsplit(existing_format)

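This will split on all the columns, in column order, and use the names from the column headers. The values keep their column types (numeric here), so convert the columns to character first if you need strings as in desired_format.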
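As a rough check of the recursive split, the sketch below rebuilds the question's data in base R (an assumption made to avoid the dplyr dependency), converts every column to character, and then applies the same rsplit logic. The name `as_chr` is introduced here for illustration only.

```r
# Rebuild the question's data frame with base R only (no dplyr needed)
existing_format <- unique(
  mtcars[order(mtcars$carb, mtcars$gear, mtcars$cyl), c("carb", "gear", "cyl")]
)

# Convert every column to character so the leaves are strings, not numbers
as_chr <- existing_format
as_chr[] <- lapply(as_chr, as.character)

# Same recursive split as in the answer above
rsplit <- function(dd) {
  col <- names(dd)[1]
  dat <- dd[[1]]
  lapply(unique(dat), function(x) {
    z <- setNames(list(x), col)
    if (ncol(dd) > 1) {
      z[[names(dd)[2]]] <- rsplit(dd[dat == x, -1, drop = FALSE])
    }
    z
  })
}

nested <- rsplit(as_chr)

# Inspect the first top-level element (carb == "1")
str(nested[[1]])
```

With the character conversion in place, identical(nested[1:2], desired_format) should hold for the two elements shown in the question.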




Answer 2:


Here's a way, not general for n columns, but it works for 3.

library(purrr)
library(magrittr)
library(dplyr)

output <- existing_format                           %>%
    map_df(as.character)                            %>%
    group_by(carb,gear)                             %>%
    summarize_at("cyl",~lst(map(.,~lst(cyl = .x)))) %>%
    mutate(gear = map2(.x = gear,.y = cyl,~lst(gear = .x,cyl = .y))) %>%
    group_by(carb)                                  %>%
    summarize_at("gear",~lst(gear=.))               %$%
    map2(.x = carb,.y = gear,~lst(carb = .x,gear = .y))

identical(output[1:2],desired_format) #TRUE


Source: https://stackoverflow.com/questions/47802545/converting-data-frame-into-deeply-nested-list
