How to convert a list of lists to a dataframe - non-identical lists

闹比i 2020-12-17 05:59

I have a list in which each element is a named list, but the elements do not all have the same names. I have read solutions on how to convert lists of lists to dataframes here and

3 answers
  •  悲哀的现实
    2020-12-17 06:44

    If you are OK with the result being a matrix in which every value has the same type (say, character), you can write your own function, like this:

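    list2mat <- function(inList) {
      # Flatten everything into one named character vector
      UL <- unlist(inList)
      # The distinct field names become the column names
      Nam <- unique(names(UL))
      # Pre-allocate the result, filled with NA
      M <- matrix(NA_character_, 
                  nrow = length(inList), ncol = length(Nam), 
                  dimnames = list(NULL, Nam))
      # Row index: one entry per field (assumes every field is a length-1 value)
      Row <- rep(seq_along(inList), lengths(inList))
      # Column index: position of each field name among the columns
      Col <- match(names(UL), Nam)
      # Fill all cells at once using a two-column (row, col) index matrix
      M[cbind(Row, Col)] <- UL
      M
    }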
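
    To make the indexing step concrete, here is a minimal sketch of what the intermediate vectors look like. The definition of `lisnotOK` is not shown in this post, so the input below is an assumption reconstructed from the output printed in the usage example.

```r
# Hypothetical input: reconstructed from the printed output,
# not the asker's original definition.
lisnotOK <- list(list(a = 1, b = 2, c = "hi"),
                 list(b = 2, c = "hello", d = "nope"))

UL  <- unlist(lisnotOK)    # named character vector: a, b, c, b, c, d
Nam <- unique(names(UL))   # "a" "b" "c" "d"

Row <- rep(seq_along(lisnotOK), lengths(lisnotOK))  # 1 1 1 2 2 2
Col <- match(names(UL), Nam)                        # 1 2 3 2 3 4

M <- matrix(NA_character_, nrow = length(lisnotOK), ncol = length(Nam),
            dimnames = list(NULL, Nam))

# Each row of cbind(Row, Col) addresses exactly one cell of M
M[cbind(Row, Col)] <- UL
M
```

    Cells that no (row, col) pair addresses, such as column `a` of the second row, simply keep their initial `NA`.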
    

    Usage would be:

    list2mat(lisnotOK)
    #      a   b   c       d     
    # [1,] "1" "2" "hi"    NA    
    # [2,] NA  "2" "hello" "nope"
    

    This should be pretty fast since everything is pre-allocated and you are making use of matrix indexing.
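
    Since the question asks for a dataframe rather than a matrix, one possible follow-up step (not part of the answer above, just a sketch) is to convert the character matrix and let R re-guess each column's type with `type.convert`. The matrix is typed in by hand here to mirror the printed output.

```r
# Character matrix mirroring the printed result (entered by hand for this sketch)
M <- matrix(c("1", NA, "2", "2", "hi", "hello", NA, "nope"),
            nrow = 2, dimnames = list(NULL, c("a", "b", "c", "d")))

# Convert to a data frame, keeping strings as character
df <- as.data.frame(M, stringsAsFactors = FALSE)

# Re-guess the type of each column from its character values
df[] <- lapply(df, type.convert, as.is = TRUE)
str(df)
```

    Columns whose values all look numeric become numeric again; the others stay character.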


    Update: Benchmarks (since you said efficiency was a concern)

    library(plyr)  # provides ldply()
    fun1 <- function(inList) ldply(inList, data.frame)
    fun2 <- function(inList) list2mat(inList)
    
    library(microbenchmark)
    microbenchmark(fun1(lisnotOK), fun2(lisnotOK))
    # Unit: microseconds
    #            expr      min        lq    median       uq      max neval
    #  fun1(lisnotOK) 4193.808 4340.0585 4523.3000 4912.233 7600.341   100
    #  fun2(lisnotOK)  163.784  182.3865  211.2515  236.910  363.489   100
    
    L2 <- unlist(replicate(1000, lisnotOK, simplify=FALSE), recursive=FALSE)
    microbenchmark(fun1(L2), fun2(L2), times = 10)
    # Unit: milliseconds
    #      expr        min         lq     median         uq        max neval
    #  fun1(L2) 3032.71572 3106.79006 3196.17178 3306.11756 3609.67445    10
    #  fun2(L2)   24.16817   24.86991   25.65569   27.44128   29.41908    10
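
    For readers unsure how the larger test list `L2` is built, the sketch below walks through that one line. As before, `lisnotOK` is a hypothetical reconstruction, since its definition is not shown in the post.

```r
# Hypothetical input, reconstructed from the printed output earlier in the answer
lisnotOK <- list(list(a = 1, b = 2, c = "hi"),
                 list(b = 2, c = "hello", d = "nope"))

# 1000 copies of the two-record list, kept as a list of lists of records
nested <- replicate(1000, lisnotOK, simplify = FALSE)

# Flatten one level only, so each element is again a single record
L2 <- unlist(nested, recursive = FALSE)

length(nested)  # number of copies
length(L2)      # number of records after flattening
```

    With `recursive = FALSE` the records themselves are left intact, so `L2` has the same shape as `lisnotOK`, only longer.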
    
