all combinations of k numbers between 0 and n whose sum equals n, speed optimization

Happy的楠姐 2021-01-25 18:42

I have this R function to generate a matrix of all combinations of k numbers between 0 and n whose sum equals n. This is one of the bottlenecks of my program as it becomes extre

5 answers
  •  自闭症患者
    2021-01-25 19:21

    Here's a different approach: build the combinations incrementally from width 1 up to width k, and at each step prune the partial combinations whose sum already exceeds n. This should be faster when k is large relative to n, because you never materialize anything close to the full grid of (n+1)^k candidates.

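    sum.comb2 <- function(n, k) {
      # Width-1 partial combinations and their running sums
      combos <- 0:n
      sums <- 0:n
      # With k = 1 the only combination is n itself (2:k would count down to 1)
      if (k == 1) return(as.character(n))
      for (width in 2:k) {
        # Append every value 0..n to each surviving partial combination
        combos <- apply(expand.grid(combos, 0:n), 1, paste, collapse=" ")
        sums <- apply(expand.grid(sums, 0:n), 1, sum)
        if (width == k) {
          # Final width: keep only the combinations that sum to exactly n
          return(combos[sums == n])
        } else {
          # Prune partial combinations whose sum already exceeds n
          combos <- combos[sums <= n]
          sums <- sums[sums <= n]
        }
      }
    }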
    
    # Simple test
    sum.comb2(3, 2)
    # [1] "3 0" "2 1" "1 2" "0 3"
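
    The function above returns space-separated strings, while the question asks for a matrix. A minimal sketch of one way to convert the result is shown here; `combos_to_matrix` is a hypothetical helper name, not part of the original answer, and it assumes every string holds the same number of values.

    ```r
    # Hypothetical helper: turn strings such as "3 0" into rows of a numeric matrix.
    combos_to_matrix <- function(combos) {
      # Trim padding and split on runs of spaces, so "3 0" becomes c("3", "0")
      parts <- strsplit(trimws(combos), " +")
      # One row per combination, filled row by row
      matrix(as.numeric(unlist(parts)), nrow = length(combos), byrow = TRUE)
    }

    # Input copied from the simple test output above
    m <- combos_to_matrix(c("3 0", "2 1", "1 2", "0 3"))
    dim(m)      # 4 2
    rowSums(m)  # 3 3 3 3
    ```

    Checking that every row sums to n, as in the last line, is a cheap sanity test on the output.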
    

    Here's an example of the speedups with small n and large k:

    library(microbenchmark)
    microbenchmark(sum.comb2(1, 100))
    # Unit: milliseconds
    #               expr      min      lq   median       uq      max neval
    #  sum.comb2(1, 100) 149.0392 158.716 162.1919 174.0482 236.2095   100
    

    This runs in well under a second (median about 162 ms), whereas the approach from the original post would never get past its call to expand.grid, since the full grid would have 2^100 rows.
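
    To put numbers on that gap, the sketch below compares the size of the full grid with the number of valid combinations. It relies on the standard stars-and-bars count: the number of ordered k-tuples of non-negative integers summing to n is choose(n + k - 1, k - 1). The function names `grid_rows` and `valid_combos` are hypothetical and used only for illustration.

    ```r
    # Rows in the full expand.grid over 0..n repeated k times
    grid_rows <- function(n, k) (n + 1)^k
    # Number of ordered k-tuples of non-negative integers that sum to n
    valid_combos <- function(n, k) choose(n + k - 1, k - 1)

    grid_rows(1, 100)     # 2^100, about 1.27e30
    valid_combos(1, 100)  # 100
    grid_rows(3, 10)      # 1048576
    valid_combos(3, 10)   # 220
    ```

    The pruned intermediate sets are larger than the final answer, but they stay far below the full grid in both cases.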

    Even in a less extreme case, with n=3 and k=10, we see a roughly 17x speedup (by median time) compared to the function in the original post:

    microbenchmark(sum.comb(3, 10), sum.comb2(3, 10))
    # Unit: milliseconds
    #              expr       min        lq    median        uq       max neval
    #   sum.comb(3, 10) 404.00895 439.94472 446.67452 461.24909 574.80426   100
    #  sum.comb2(3, 10)  23.27445  24.53771  25.60409  26.97439  65.59576   100
    
