How to convert a huge list-of-vector to a matrix more efficiently?

别那么骄傲 2020-12-07 14:44

I have a list of length 130,000 where each element is a character vector of length 110. I would like to convert this list to a matrix with dimension 1,430,000*10. How can I

5 Answers
  • 2020-12-07 15:11

    This should be equivalent to your current code, only a lot faster:

    output <- matrix(unlist(z), ncol = 10, byrow = TRUE)
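To see what this one-liner does, here is a minimal sketch on made-up data. The list `z` below is a hypothetical stand-in for the real one (3 character vectors of length 110 instead of 130,000), so the values are illustrative only.

```r
# Hypothetical stand-in for the real data: 3 character vectors of length 110
z <- lapply(1:3, function(i) paste0("v", i, "_", 1:110))

# Flatten the list into one long vector, then fill a 10-column matrix row by row
output <- matrix(unlist(z), ncol = 10, byrow = TRUE)

dim(output)       # 33 rows (11 per list element) by 10 columns
output[1, 1:3]    # "v1_1" "v1_2" "v1_3"
output[12, 1]     # "v2_1", i.e. the second vector starts at row 12
```

Each vector of length 110 ends up as 11 consecutive rows, which is why 130,000 vectors give 1,430,000 rows.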
    
  • 2020-12-07 15:18

    I think you want

    output <- do.call(rbind,lapply(z,matrix,ncol=10,byrow=TRUE))
    

    i.e. combining @BlueMagister's use of do.call(rbind,...) with an lapply statement to convert the individual list elements into 11*10 matrices ...
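A quick way to convince yourself that this gives the same result as the `unlist` one-liner is to compare the two on a small list. This is only a sketch: the data and the names `by_rbind` and `by_unlist` are made up for illustration.

```r
# Hypothetical small input: 4 character vectors of length 110
z <- lapply(1:4, function(i) paste0("v", i, "_", 1:110))

# Build one 11*10 matrix per list element, then stack them with a single rbind
by_rbind <- do.call(rbind, lapply(z, matrix, ncol = 10, byrow = TRUE))

# Flatten everything first, then reshape once
by_unlist <- matrix(unlist(z), ncol = 10, byrow = TRUE)

identical(by_rbind, by_unlist)   # TRUE
```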

    Benchmarks (showing that @flodel's unlist solution is about 4.5x faster than mine, and about 230x faster than the original approach):

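    n <- 1000
    z <- replicate(n,matrix(1:110,ncol=10,byrow=TRUE),simplify=FALSE)
    library(rbenchmark)
    ## original approach: grow the result with rbind on every iteration
    origfn <- function(z) {
        output <- NULL
        for(i in seq_along(z))
            output <- rbind(output,matrix(z[[i]],ncol=10,byrow=TRUE))
        output  # return the result (a bare for loop returns NULL)
    }
    ## one rbind call over all the 11*10 pieces
    rbindfn <- function(z) do.call(rbind,lapply(z,matrix,ncol=10,byrow=TRUE))
    ## flatten once, then reshape
    unlistfn <- function(z) matrix(unlist(z), ncol = 10, byrow = TRUE)
    benchmark(origfn(z), rbindfn(z), unlistfn(z), replications = 100)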
    
    ##          test replications elapsed relative user.self sys.self 
    ## 1   origfn(z)          100  36.467  230.804    34.834    1.540  
    ## 2  rbindfn(z)          100   0.713    4.513     0.708    0.012 
    ## 3 unlistfn(z)          100   0.158    1.000     0.144    0.008 
    

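    If this scales linearly (i.e. you don't run into memory problems), the full problem is 130 times larger than this test. Since elapsed is the total over 100 replications, a single unlistfn(z) call took about 0.0016 seconds here, so the full problem would take roughly 130*0.0016 seconds = 0.2 seconds on a comparable machine (I did this on a 2-year-old MacBook Pro).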
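Whether the timing really scales with the list length is easy to check empirically. The sketch below uses base R's `system.time` rather than rbenchmark and times the `unlist` approach at a few list lengths. The sizes, the 20 repeats, and the test data are arbitrary choices, and the numbers will differ from machine to machine.

```r
unlistfn <- function(z) matrix(unlist(z), ncol = 10, byrow = TRUE)

sizes <- c(1000, 2000, 4000, 8000)   # arbitrary list lengths to try

timings <- sapply(sizes, function(n) {
    # Hypothetical test data: n character vectors of length 110
    z <- replicate(n, as.character(1:110), simplify = FALSE)
    # Total elapsed seconds over 20 repeated calls, to smooth out noise
    system.time(for (k in 1:20) unlistfn(z))[["elapsed"]]
})

# If the approach scales linearly, elapsed should roughly double with n
data.frame(n = sizes, elapsed = timings)
```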

  • 2020-12-07 15:23

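    You can also use the following, though note that it returns one column per list element (so 110 rows here), and the result still has to be reshaped to get 10 columns: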

    output <- as.matrix(as.data.frame(z))
    

    The memory usage is very similar to

    output <- matrix(unlist(z), ncol = 10, byrow = TRUE)
    

    This can be verified with mem_change() from the pryr package.
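If pryr is not available, base R's `object.size` gives a rough comparison of the finished objects. The sketch below uses a small made-up list and reports object sizes only, not peak memory during construction. It also shows the shape of each result and one way to reshape the data frame version into 10 columns.

```r
# Hypothetical small input: 50 character vectors of length 110
n <- 50
z <- replicate(n, as.character(1:110), simplify = FALSE)

via_df     <- as.matrix(as.data.frame(z))
via_unlist <- matrix(unlist(z), ncol = 10, byrow = TRUE)

dim(via_df)       # 110 rows, one column per list element
dim(via_unlist)   # 11 * n rows, 10 columns

# Sizes of the finished objects; via_df also carries dimnames
object.size(via_df)
object.size(via_unlist)

# Dropping the attributes and reshaping recovers the 10-column layout
reshaped <- matrix(as.vector(via_df), ncol = 10, byrow = TRUE)
identical(reshaped, via_unlist)   # TRUE
```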

  • 2020-12-07 15:28

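    You can try as.matrix as below (note that on a list this returns a one-column list matrix, so the elements still need to be unlisted to get a character matrix):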

    output <- as.matrix(z)
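It is worth checking what `as.matrix` actually returns for a list before relying on it. A minimal sketch with a tiny made-up list:

```r
# Hypothetical tiny input: 3 character vectors of length 2
z <- list(c("a", "b"), c("c", "d"), c("e", "f"))

m <- as.matrix(z)

dim(m)       # 3 rows, 1 column
is.list(m)   # TRUE: each cell still holds a whole vector
m[[2, 1]]    # the second vector, c("c", "d")
```

So the result has one row per list element rather than one cell per value.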
    
  • 2020-12-07 15:30

    It would help to have sample information about your output. Recursively using rbind on bigger and bigger things is not recommended. My first guess at something that would help you:

    z <- list(1:3,4:6,7:9)
    do.call(rbind,z)
    

    See a related question for more efficiency, if needed.
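For reference, here is what the snippet above produces, followed by a sketch of how the same idea might be applied to vectors of length 110. The list `z2` and the reshaping step are assumptions about the asker's data, not something stated in the question.

```r
z <- list(1:3, 4:6, 7:9)
m <- do.call(rbind, z)   # one rbind call, each vector becomes a row
m
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9

# Hypothetical data shaped like the question: vectors of length 110
z2 <- lapply(1:2, function(i) paste0("v", i, "_", 1:110))
wide <- do.call(rbind, z2)
dim(wide)   # 2 rows, 110 columns

# Transposing puts each vector in its own column, so the values are stored
# in their original order and can be refilled into 10 columns row by row
output <- matrix(t(wide), ncol = 10, byrow = TRUE)
dim(output)   # 22 rows, 10 columns
```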
