Convert Mixed-Length named List to data.frame

夕颜 2020-12-31 04:52

I have a list of the following format:

[[1]]
[[1]]$a
[1] 1

[[1]]$b
[1] 3

[[1]]$c
[1] 5

[[2]]       
[[2]]$c
[1] 2

[[2]]$a
[1] 3

There i

6 Answers
  •  执笔经年
    2020-12-31 05:23

    Well, I gave my first thought a try and the performance wasn't as bad as I was afraid of, but I'm sure there's still room for improvement (especially in the wasteful matrix -> data.frame conversion).

    convertList <- function(myList, ids){
        # Compute, for each list element, the column index of each of its values. This
        # handles missing or differently ordered fields: a maps to 1, b to 2, and c to 3,
        # so an element with fields in the order a, c, b yields `1,3,2`.
        idInd <- lapply(myList, function(x){match(names(x), ids)})
    
        # Calculate the row index of each value as it would appear in unlist(myList). So if
        # there were two values in the first row, 3 in the second, and 1 in the third,
        # you'd see: 1, 1, 2, 2, 2, 3
        rowInd <- rep(seq_along(myList), lengths(myList))
    
        # Unlist the column indices into a single integer vector
        idInd <- unlist(idInd)
    
        #create a grid of addresses. The first column is the row address, the second is the col
        address <- cbind(rowInd, idInd)
    
        #have to use a matrix because you can't assign a data.frame 
        # using an addressing table like we have above
        mat <- matrix(ncol=length(ids), nrow=length(myList))
    
        # assign the values to the addresses in the matrix
        mat[address] <- unlist(myList)
    
        # convert to data.frame
        df <- as.data.frame(mat)
        colnames(df) <- ids
    
        df
    }   
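    # createList() is not defined in this answer; it is assumed to be the
    # test-data generator from the question.
    myList <- createList(50000)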
    ids <- letters[1:3]
    
    system.time(df <- convertList(myList, ids))
    

    It's taking about 0.29 seconds to convert the 50,000 rows on my laptop (Windows 7, Intel i7 M620 @ 2.67 GHz, 4GB RAM).
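
    To make the addressing trick easier to follow, here is a minimal sketch on a two-element list shaped like the one in the question. The sample data and the name `colInd` are illustrative assumptions, not part of the benchmark above; the point is only that a two-column (row, column) matrix used as an index fills one cell per value.

    ```r
    # Two records with different fields in different orders (illustrative data)
    myList <- list(list(a = 1, b = 3, c = 5), list(c = 2, a = 3))
    ids <- c("a", "b", "c")

    # Column index of every value, in unlist() order: 1 2 3 3 1
    colInd <- unlist(lapply(myList, function(x) match(names(x), ids)))

    # Row index of every value, in the same order: 1 1 1 2 2
    rowInd <- rep(seq_along(myList), lengths(myList))

    # Start from an all-NA matrix so fields missing from a record stay NA
    mat <- matrix(NA_real_, nrow = length(myList), ncol = length(ids),
                  dimnames = list(NULL, ids))

    # Indexing with a two-column matrix addresses one (row, col) cell per row
    mat[cbind(rowInd, colInd)] <- unlist(myList)

    df <- as.data.frame(mat)
    print(df)
    ```

    The second record has no `b`, so that cell is left as `NA`, and its `c` and `a` values land in the right columns even though they were listed in a different order.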

    Still very much interested in other answers!
