Question
I have a list of identically structured lists as follows:
test1 <- list(first = data.frame(col1 = c(1,2), col2 = c(3,4)),
second = data.frame(COL1 = c(100,200), COL2 = c(300, 400)))
test2 <- list(first = data.frame(col1 = c(5,6), col2 = c(7,8)),
second = data.frame(COL1 = c(500,600), COL2 = c(700,800)))
orig.list <- list(test1, test2)
I want to:
- Bind the rows of the first element of each nested list together, bind the rows of the second element of each nested list together, etc.
- Recombine the resulting elements into a single list with an identical structure to the first list.
I can easily do this element by element via:
firsts <- orig.list %>% purrr::map(1) %>% dplyr::bind_rows()
seconds <- orig.list %>% purrr::map(2) %>% dplyr::bind_rows()
new.list <- list(first = firsts, second = seconds)
However, for n list elements this requires that I:
- know the number of elements in each list,
- know the names and orders of the elements so I can recreate the new list with the correct names and order,
- copy and paste the same line of code over and over again.
I'm looking for a way to apply purrr::map (or some other tidyverse function) more generically to combine all elements of a list of lists, preserving the element names and order.
Answer 1:
In the simplest case, as in the data you've shown, you can use pmap to walk through the lists in parallel and bind_rows to combine the individual data frames:
library(tidyverse)
pmap(orig.list, bind_rows)
#$first
# col1 col2
#1 1 3
#2 2 4
#3 5 7
#4 6 8
#$second
# COL1 COL2
#1 100 300
#2 200 400
#3 500 700
#4 600 800
identical(pmap(orig.list, bind_rows), new.list)
# [1] TRUE
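For intuition about what pmap is doing here, the same position-by-position pairing can be sketched in base R with Map and rbind. This is only an illustration, not a replacement for the answer's approach: the name by.position is made up for the example, and like pmap it pairs elements by position and takes the result names from the first sublist.

```r
# Same data as in the question
test1 <- list(first = data.frame(col1 = c(1, 2), col2 = c(3, 4)),
              second = data.frame(COL1 = c(100, 200), COL2 = c(300, 400)))
test2 <- list(first = data.frame(col1 = c(5, 6), col2 = c(7, 8)),
              second = data.frame(COL1 = c(500, 600), COL2 = c(700, 800)))
orig.list <- list(test1, test2)

# Equivalent to Map(rbind, test1, test2): for each position i this calls
# rbind(test1[[i]], test2[[i]]). do.call lets it work for any number of sublists.
# by.position is a hypothetical name used only in this sketch.
by.position <- do.call(Map, c(list(f = rbind), orig.list))

names(by.position)
str(by.position$first)
```

Because the pairing is purely positional, this (like the pmap call) assumes every sublist has the same elements in the same order.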
To make this a little more generic, i.e. to handle cases where the number of elements and the order of names in each sublist can vary, you can use:
map(map_df(orig.list, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)
i.e. you nest each sublist as a data frame and let bind_rows match up the names for you.
Test cases:
With test1 unchanged, switch the order of the elements in test2:
test2 <- list(second = data.frame(COL1 = c(500,600), COL2 = c(700,800)),
first = data.frame(col1 = c(5,6), col2 = c(7,8)))
orig.list1 <- list(test1, test2)
map(map_df(orig.list1, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)
gives:
#$first
# col1 col2
#1 1 3
#2 2 4
#3 5 7
#4 6 8
#$second
# COL1 COL2
#1 100 300
#2 200 400
#3 500 700
#4 600 800
Now drop one element from test2:
test2 <- list(first = data.frame(col1 = c(5,6), col2 = c(7,8)))
orig.list2 <- list(test1, test2)
map(map_df(orig.list2, ~ as.data.frame(map(.x, ~ unname(nest(.))))), bind_rows)
gives:
#$first
# col1 col2
#1 1 3
#2 2 4
#3 5 7
#4 6 8
#$second
# COL1 COL2
#1 100 300
#2 200 400
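The nest-based one-liner is compact but can be hard to read. A more explicit sketch of the same name-matching idea follows, assuming purrr and dplyr are installed; element.names and combined are names invented for this example. It collects every element name that appears in any sublist, then binds the rows of the elements with that name, skipping sublists where the name is missing.

```r
library(purrr)
library(dplyr)

# test2 deliberately lacks the "second" element
test1 <- list(first = data.frame(col1 = c(1, 2), col2 = c(3, 4)),
              second = data.frame(COL1 = c(100, 200), COL2 = c(300, 400)))
test2 <- list(first = data.frame(col1 = c(5, 6), col2 = c(7, 8)))
orig.list2 <- list(test1, test2)

# All element names, in order of first appearance
element.names <- unique(unlist(map(orig.list2, names)))

# map(orig.list2, nm) extracts the element called nm from each sublist
# (NULL where it is absent); bind_rows drops the NULLs while binding.
combined <- map(set_names(element.names),
                function(nm) bind_rows(map(orig.list2, nm)))

names(combined)
nrow(combined$first)
nrow(combined$second)
```

The trade-off is a second pass over the list to collect the names, in exchange for not depending on how nest behaves in a given tidyr version.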
Answer 2:
You want purrr::transpose:
library(purrr)
library(dplyr)
transpose(orig.list) %>% map(bind_rows)
# $first
# col1 col2
# 1 1 3
# 2 2 4
# 3 5 7
# 4 6 8
#
# $second
# COL1 COL2
# 1 100 300
# 2 200 400
# 3 500 700
# 4 600 800
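In purrr 1.0.0 and later, transpose is marked as superseded in favour of list_transpose. A sketch of the same approach with the newer function, assuming purrr >= 1.0.0 is installed; simplify = FALSE is passed so that the transposed elements stay as plain lists of data frames, and combined is a name made up for this example.

```r
library(purrr)  # list_transpose() needs purrr >= 1.0.0
library(dplyr)

test1 <- list(first = data.frame(col1 = c(1, 2), col2 = c(3, 4)),
              second = data.frame(COL1 = c(100, 200), COL2 = c(300, 400)))
test2 <- list(first = data.frame(col1 = c(5, 6), col2 = c(7, 8)),
              second = data.frame(COL1 = c(500, 600), COL2 = c(700, 800)))
orig.list <- list(test1, test2)

# Turn the list of sublists "inside out" (one entry per element name),
# then bind the data frames collected under each name.
combined <- orig.list %>%
  list_transpose(simplify = FALSE) %>%
  map(bind_rows)

names(combined)
```

By default both functions take the element names from the first sublist, so the result keeps the names and order of test1.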
Source: https://stackoverflow.com/questions/46504036/combine-every-ith-element-of-a-list-of-lists-together-using-dplyr-purrr