Cleaner way of constructing binary matrix from vector

旧时难觅i 2020-12-22 02:08

I have a fun challenge: I'm trying to construct a binary matrix from an integer vector. The binary matrix should contain as many rows as the length of the vector, and as many

2 Answers
  • 2020-12-22 02:33

    You can, of course, also just use table:

    > table(sequence(length(playv)), playv)
        playv
         0 1 2 3 4 5
      1  0 1 0 0 0 0
      2  0 0 1 0 0 0
      3  0 0 0 1 0 0
      4  0 0 0 0 0 1
      5  0 1 0 0 0 0
      6  0 0 0 0 0 1
      7  0 0 0 0 0 1
      8  0 0 0 1 0 0
      9  0 0 0 1 0 0
      10 1 0 0 0 0 0
      11 0 1 0 0 0 0
      12 0 1 0 0 0 0
      13 0 0 0 0 1 0
      14 0 0 1 0 0 0
      15 0 0 0 0 1 0
      16 0 0 1 0 0 0
      17 0 0 0 0 1 0
      18 0 0 0 0 0 1
      19 0 0 1 0 0 0
      20 0 0 0 0 1 0
    

    If speed is a concern, I would suggest a manual approach. First, identify the unique values in your vector. Second, create an empty matrix to fill in. Third, use matrix indexing to identify the positions that should be filled in as 1.

    Like this:

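    f3 <- function(vec) {
      # One column per distinct value, in sorted order
      U <- sort(unique(vec))
      # Start from an all-zero matrix with one row per element of vec
      M <- matrix(0, nrow = length(vec),
                  ncol = length(U),
                  dimnames = list(NULL, U))
      # Each (row, column) pair marks where element i's value sits in U
      M[cbind(seq_along(vec), match(vec, U))] <- 1L
      M
    }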
    

    Usage would be f3(playv).
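
To make the matrix-indexing step concrete, here is a small sketch that walks through the same three steps outside the function. The toy vector `vec` and the intermediate name `pos` are illustrative assumptions, not part of the original answer, and the matrix is created with `0L` here so the result is an integer matrix.

```r
vec <- c(2L, 0L, 2L, 5L)   # toy input, chosen only for illustration

# Step 1: the distinct values, sorted; these become the columns
U <- sort(unique(vec))     # 0 2 5

# Step 2: an all-zero matrix, one row per element and one column per value
M <- matrix(0L, nrow = length(vec), ncol = length(U),
            dimnames = list(NULL, U))

# Step 3: a two-column matrix of (row, column) pairs used as an index
pos <- cbind(seq_along(vec), match(vec, U))
M[pos] <- 1L

M
#      0 2 5
# [1,] 0 1 0
# [2,] 1 0 0
# [3,] 0 1 0
# [4,] 0 0 1
```

Indexing a matrix with a two-column matrix assigns one cell per row of the index, which is why a single assignment fills every position without a loop.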

    Adding that to the benchmarks from the other answer (where f1 uses model.matrix and f2 uses table), we get:

    library(microbenchmark)
    microbenchmark(f1(v), f2(v), f3(v), times = 10)
    # Unit: milliseconds
    #   expr       min        lq    median        uq       max neval
    #  f1(v) 2104.4808 3151.4308 3314.8173 3344.6696 4023.5246    10
    #  f2(v) 3956.5678 4782.7863 5994.4448 6320.1901 6646.0405    10
    #  f3(v)  486.4406  574.1133  746.9112  927.3407  987.9121    10
    
  • 2020-12-22 02:38
    set.seed(1)
    playv <- sample(0:5,20,replace=TRUE)
    playv <- as.character(playv)
    results <- model.matrix(~playv-1)
    

    You may want to rename the columns of results afterwards.
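
One possible way to do the renaming, as a minimal sketch: model.matrix prefixes each column name with the variable name (here `playv`), so stripping that prefix with `sub()` leaves the bare values. The choice of `sub()` and the target names are assumptions about what the renamed columns should look like.

```r
set.seed(1)
playv <- as.character(sample(0:5, 20, replace = TRUE))
results <- model.matrix(~ playv - 1)

# Column names come out as "playv0", "playv1", ...;
# drop the leading "playv" so only the value remains
colnames(results) <- sub("^playv", "", colnames(results))

colnames(results)
```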

    I like the solution provided by Ananda Mahto and compared it to model.matrix. Here is the code:

    library(microbenchmark)
    
    set.seed(1)
    v <- sample(1:10,1e6,replace=TRUE)
    
    f1 <- function(vec) {
      vec <- as.character(vec)
      model.matrix(~vec-1)
    }
    
    f2 <- function(vec) {
      table(sequence(length(vec)), vec)
    }
    
    microbenchmark(f1(v), f2(v), times=10)
    

    model.matrix was a little faster than table:

    Unit: seconds
      expr      min       lq   median       uq      max neval
     f1(v) 2.890084 3.147535 3.296186 3.377536 3.667843    10
     f2(v) 4.824832 5.625541 5.757534 5.918329 5.966332    10
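
The benchmark only compares speed, so a quick check that the two functions produce the same 0/1 pattern may be useful. The sketch below is an assumption-laden illustration: the toy vector `x` and the helper `as_plain()` are hypothetical names introduced here, and `seq_along(vec)` stands in for `sequence(length(vec))`.

```r
f1 <- function(vec) {
  vec <- as.character(vec)
  model.matrix(~ vec - 1)
}

f2 <- function(vec) {
  table(seq_along(vec), vec)
}

# Hypothetical helper: drop classes, attributes and dimnames,
# keeping only the numeric cell values in matrix form
as_plain <- function(m, n) matrix(as.numeric(m), nrow = n)

x <- c(3L, 1L, 2L, 3L, 1L)   # toy input with single-digit values
same <- identical(as_plain(f1(x), length(x)), as_plain(f2(x), length(x)))
same

# Caveat: f1 converts to character first, so its columns follow text order
colnames(f1(c(2L, 10L)))
```

With single-digit values the two results line up column for column. Because `f1` converts the vector to character, values of 10 and above are ordered as text (the column for 10 comes before the column for 2), whereas `table` on the integer vector orders them numerically, so the columns may need reordering before comparing on data like `sample(1:10, ...)`.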
    