Combine every ith element of a list of lists together using dplyr, purrr

Question

I have a list of identically structured lists as follows:

    test1 <- list(first = data.frame(col1 = c(1,2), col2 = c(3,4)), 
                  second = data.frame(COL1 = c(100,200), COL2 = c(300, 400)))

    test2 <- list(first = data.frame(col1 = c(5,6), col2 = c(7,8)), 
                  second = data.frame(COL1 = c(500,600), COL2 = c(700,800)))

    orig.list <- list(test1, test2)

I want to:

Bind the rows of the first element of each nested list together, bind the rows of the second element of each nested list together, and so on.
Recombine the resulting elements into a single list with the same structure as the first list.

I can easily do this element by element via:

    firsts <- orig.list %>% purrr::map(1) %>% dplyr::bind_rows()
    seconds <- orig.list %>% purrr::map(2) %>% dplyr::bind_rows()

    new.list <- list(first = firsts, second = seconds)

However, for n list elements this requires that I:

know the number of elements in each list,
know the names and orders of the elements so I can recreate the new list with the correct names and order,
copy and paste the same line of code over and over again.

I'm looking for a way to apply purrr::map (or some other tidyverse function) more generically to combine all elements of a list of lists, preserving the element names and order.

Answer 1:

In the simplest case, as with the data you've shown, you can use pmap to walk through the lists in parallel and bind_rows to combine the individual data frames:

library(tidyverse)
pmap(orig.list, bind_rows)

#$first
#  col1 col2
#1    1    3
#2    2    4
#3    5    7
#4    6    8

#$second
#  COL1 COL2
#1  100  300
#2  200  400
#3  500  700
#4  600  800

identical(pmap(orig.list, bind_rows), new.list)
# [1] TRUE
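
The pmap call above pairs elements by position, not by name. As a minimal sketch of what that implies (test2_swapped is a hypothetical variant of test2 introduced only for illustration), reversing the element order in one sublist makes pmap bind unrelated data frames:

```r
library(purrr)
library(dplyr)

test1 <- list(first = data.frame(col1 = c(1, 2), col2 = c(3, 4)),
              second = data.frame(COL1 = c(100, 200), COL2 = c(300, 400)))

# Hypothetical variant of test2: same data, elements in reverse order
test2_swapped <- list(second = data.frame(COL1 = c(500, 600), COL2 = c(700, 800)),
                      first = data.frame(col1 = c(5, 6), col2 = c(7, 8)))

# pmap() calls bind_rows(test1[[1]], test2_swapped[[1]]), then
# bind_rows(test1[[2]], test2_swapped[[2]]); sublist names are not consulted
mismatched <- pmap(list(test1, test2_swapped), bind_rows)

# The "first" result now mixes both column sets, padded with NA
names(mismatched$first)
```

Here mismatched$first ends up with four columns instead of two, which is why the positional form is only safe when every sublist has the same elements in the same order.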

To make this a bit more generic, i.e. to handle cases where the number of elements and the order of names in each sublist can vary, you can use:

map(map_df(orig.list, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)

i.e. you nest each sublist as a data frame and let bind_rows match the names for you.

Test Cases:

With test1 the same, switch the order of the elements in test2:

test2 <- list(second = data.frame(COL1 = c(500,600), COL2 = c(700,800)),
              first = data.frame(col1 = c(5,6), col2 = c(7,8)))

orig.list1 <- list(test1, test2)

map(map_df(orig.list1, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)

gives:

#$first
#  col1 col2
#1    1    3
#2    2    4
#3    5    7
#4    6    8

#$second
#  COL1 COL2
#1  100  300
#2  200  400
#3  500  700
#4  600  800

Now drop one element from test2:

test2 <- list(first = data.frame(col1 = c(5,6), col2 = c(7,8)))
orig.list2 <- list(test1, test2)

map(map_df(orig.list2, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)

gives:

#$first
#  col1 col2
#1    1    3
#2    2    4
#3    5    7
#4    6    8

#$second
#  COL1 COL2
#1  100  300
#2  200  400
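
The same name-based matching can also be written out explicitly. The following is a rough sketch, not the method used above: bind_by_name is a hypothetical helper that collects every element name seen in any sublist, then binds whichever data frames exist under each name.

```r
library(dplyr)

test1 <- list(first = data.frame(col1 = c(1, 2), col2 = c(3, 4)),
              second = data.frame(COL1 = c(100, 200), COL2 = c(300, 400)))
# Sublist with the "second" element missing, as in the last test case
test2 <- list(first = data.frame(col1 = c(5, 6), col2 = c(7, 8)))
orig.list2 <- list(test1, test2)

# Hypothetical helper: bind data frames that share an element name
bind_by_name <- function(lists) {
  # Every name that appears in any sublist, in order of first appearance
  all_names <- unique(unlist(lapply(lists, names)))
  out <- lapply(all_names, function(nm) {
    # Look the element up by name; sublists lacking it yield NULL
    pieces <- lapply(lists, function(l) l[[nm]])
    # Drop the NULLs before binding
    bind_rows(Filter(Negate(is.null), pieces))
  })
  setNames(out, all_names)
}

result <- bind_by_name(orig.list2)
```

With orig.list2, result$first has four rows and result$second has two, matching the output shown above.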

Answer 2:

You want purrr::transpose:

library(purrr)
library(dplyr)
transpose(orig.list) %>% map(bind_rows)

# $first
# col1 col2
# 1    1    3
# 2    2    4
# 3    5    7
# 4    6    8
# 
# $second
# COL1 COL2
# 1  100  300
# 2  200  400
# 3  500  700
# 4  600  800
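
By default, transpose bases the output structure on the first sublist. If the sublists may not all contain the same elements, its .names argument can supply the full set of names. A minimal sketch, assuming one sublist lacks the "second" element (test2_short and all_names are names introduced here for illustration):

```r
library(purrr)
library(dplyr)

test1 <- list(first = data.frame(col1 = c(1, 2), col2 = c(3, 4)),
              second = data.frame(COL1 = c(100, 200), COL2 = c(300, 400)))
# Hypothetical sublist with no "second" element
test2_short <- list(first = data.frame(col1 = c(5, 6), col2 = c(7, 8)))
orig.list2 <- list(test1, test2_short)

# Union of all element names across the sublists
all_names <- orig.list2 %>% map(names) %>% reduce(union)

combined <- orig.list2 %>%
  transpose(.names = all_names) %>%   # missing elements become NULL
  map(~ bind_rows(compact(.x)))       # compact() drops the NULLs before binding
```

Here combined$first has four rows and combined$second has two, and the element names and order follow all_names.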

Source: https://stackoverflow.com/questions/46504036/combine-every-ith-element-of-a-list-of-lists-together-using-dplyr-purrr

Tags

dplyr

purrr