How to convert a list of lists to a dataframe - non-identical lists

闹比i 2020-12-17 05:59

I have a list where each element is a named list, but the elements are not the same everywhere. I have read solutions on how to convert lists of lists to dataframes here and

3 Answers

悲哀的现实 (original poster)

2020-12-17 06:44

If you are OK with getting back a matrix in which every value has the same type (say, character), you can write your own function, like this:

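list2mat <- function(inList) {
  # Flatten to one named vector; assumes every value is a length-1 atomic
  UL <- unlist(inList)
  # Column names: every distinct element name, in order of first appearance
  Nam <- unique(names(UL))
  # Pre-allocate the result, filled with NA
  M <- matrix(NA_character_, 
              nrow = length(inList), ncol = length(Nam), 
              dimnames = list(NULL, Nam))
  # Row index: position of the sub-list each value came from
  Row <- rep(seq_along(inList), lengths(inList))
  # Column index: where each value's name falls in Nam
  Col <- match(names(UL), Nam)
  # Fill all cells in one assignment using a two-column (row, col) index
  M[cbind(Row, Col)] <- UL
  M
}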

Usage would be:

list2mat(lisnotOK)
#      a   b   c       d     
# [1,] "1" "2" "hi"    NA    
# [2,] NA  "2" "hello" "nope"

This should be pretty fast: the result matrix is pre-allocated, and all values are written in a single vectorized assignment through matrix indexing.
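To make the matrix-indexing step concrete, the sketch below walks through the intermediate values. The definition of `lisnotOK` is not shown in this thread, so the one used here is a reconstruction that is merely consistent with the output above; treat it as hypothetical.

```r
# Hypothetical input, reconstructed from the output shown above
lisnotOK <- list(list(a = 1, b = 2, c = "hi"),
                 list(b = 2, c = "hello", d = "nope"))

UL  <- unlist(lisnotOK)   # named character vector with names a b c b c d
Nam <- unique(names(UL))  # "a" "b" "c" "d"

# One (row, column) pair per value
Row <- rep(seq_along(lisnotOK), lengths(lisnotOK))  # 1 1 1 2 2 2
Col <- match(names(UL), Nam)                        # 1 2 3 2 3 4

# Pre-allocate, then fill every addressed cell in one assignment
M <- matrix(NA_character_, nrow = length(lisnotOK), ncol = length(Nam),
            dimnames = list(NULL, Nam))
M[cbind(Row, Col)] <- UL
M
```

Indexing a matrix with a two-column matrix treats each row of the index as one (row, column) address, so cells that are never addressed keep their pre-allocated `NA`.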

Update: Benchmarks (since you said efficiency was a concern)

library(plyr)  # provides ldply()
fun1 <- function(inList) ldply(inList, data.frame)
fun2 <- function(inList) list2mat(inList)

library(microbenchmark)
microbenchmark(fun1(lisnotOK), fun2(lisnotOK))
# Unit: microseconds
#            expr      min        lq    median       uq      max neval
#  fun1(lisnotOK) 4193.808 4340.0585 4523.3000 4912.233 7600.341   100
#  fun2(lisnotOK)  163.784  182.3865  211.2515  236.910  363.489   100

L2 <- unlist(replicate(1000, lisnotOK, simplify=FALSE), recursive=FALSE)
microbenchmark(fun1(L2), fun2(L2), times = 10)
# Unit: milliseconds
#      expr        min         lq     median         uq        max neval
#  fun1(L2) 3032.71572 3106.79006 3196.17178 3306.11756 3609.67445    10
#  fun2(L2)   24.16817   24.86991   25.65569   27.44128   29.41908    10
