Collapse consecutive runs of numbers to a string of ranges

无人及你 2020-12-15 05:57

Let's say I have the following vector of numbers:

vec = c(1, 2, 3, 5, 7, 8, 9, 10, 11, 12)

I'm looking for a function that will create a

3 Answers
  •  轮回少年
    2020-12-15 06:33

    EDITS: Sorting the vector first sped up docendo's code quite a bit, so the approaches are now on an equal footing.

    I also added alexis' approach.

    readable_integers <- function(integers)
    {
      integers <- sort(unique(integers))
      group <- cumsum(c(0, diff(integers)) != 1)
    
      paste0(vapply(split(integers, group),
               function(x){
                 if (length(x) == 1) as.character(x)
                 else paste0(range(x), collapse = "-")
               },
               character(1)),
               collapse = "; ")
    }
    
    library(microbenchmark)
    vec = c(1, 2, 3, 5, 7, 8, 9, 10, 11, 12)
    microbenchmark(
      docendo = {vec <- sort(vec)
        x <- cumsum(diff(vec) > 1)
        toString(tapply(vec, c(min(x), x), function(y) paste(unique(range(y)), collapse = "-")))
      },
      Benjamin = readable_integers(vec),
      alexis = {vec <- sort(vec)
                as.character(split(as.integer(vec), cumsum(c(TRUE, diff(vec) != 1))))
                toString(gsub(":", "-", .Last.value))}
    )
    
    Unit: microseconds
         expr     min       lq     mean  median       uq     max neval
      docendo 205.273 220.3755 230.3134 228.293 235.4780 467.142   100
     Benjamin 121.991 128.4420 135.5302 133.574 143.3980 161.286   100
       alexis 121.698 128.0030 137.0374 136.507 143.3975 169.790   100
    
    set.seed(pi)
    vec = sample(1:1000, 900)
    
    microbenchmark(
      docendo = {vec <- sort(vec)
       x <- cumsum(diff(vec) > 1)
       toString(tapply(sort(vec), c(min(x), x), function(y) paste(unique(range(y)), collapse = "-")))
      },
      Benjamin = readable_integers(vec),
      alexis = {vec <- sort(vec)
                as.character(split(as.integer(vec), cumsum(c(TRUE, diff(vec) != 1))))
                toString(gsub(":", "-", .Last.value))}
    )
    Unit: microseconds
         expr      min        lq      mean    median        uq      max neval
      docendo 1307.294 1353.7735 1420.3088 1379.7265 1427.8190 2554.473   100
     Benjamin  615.525  626.8155  661.2513  638.8385  665.3765 1676.493   100
       alexis  799.684  808.3355  866.1516  820.0650  833.2615 1974.138   100
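
The grouping step inside readable_integers can be traced by hand on the vector from the question. The sketch below is only an illustration of that step, not a replacement for any of the benchmarked versions; the intermediate names `steps`, `runs`, and `result` are made up for this example, and it assumes the input is already sorted and free of duplicates (the function itself handles that with `sort(unique(...))`).

```r
# Worked example of the grouping step, using the vector from the question.
vec <- c(1, 2, 3, 5, 7, 8, 9, 10, 11, 12)

# Difference to the previous element; the leading 0 makes the first
# element count as the start of a run.
steps <- c(0, diff(vec))        # 0 1 1 2 2 1 1 1 1 1

# Any step other than 1 opens a new run; cumsum turns those flags
# into a run id for every element.
group <- cumsum(steps != 1)     # 1 1 1 2 3 3 3 3 3 3

# Format each run as "first-last", or as the bare number when the
# run has length 1.
runs <- vapply(split(vec, group),
               function(x) {
                 if (length(x) == 1) as.character(x)
                 else paste0(range(x), collapse = "-")
               },
               character(1))

result <- paste0(runs, collapse = "; ")
print(result)                   # [1] "1-3; 5; 7-12"
```

Because a new run is opened by any step that is not exactly 1, repeated values (step 0) would also split a run, which is presumably why readable_integers calls `unique()` before grouping.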
    
