insert elements in a vector in R

悲哀的现实 2020-12-01 16:11

I have a vector in R,

a = c(2,3,4,9,10,2,4,19)

Let us say I want to efficiently insert the following vectors, b and c,

b =         


        
6 answers
  • 2020-12-01 16:33

    The straightforward approach:

    b.pos <- 3
    d.pos <- 7
    c(a[1:b.pos],b,a[(b.pos+1):d.pos],d,a[(d.pos+1):length(a)])
    [1]  2  3  4  2  1  9 10  2  4  0  1 19
    

    Note the importance of the parentheses around the bounds of the : operator.
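
    The note above is about operator precedence: in R, : binds more tightly than +. A minimal sketch of the difference, reusing b.pos and d.pos from the answer (the names without_parens and with_parens are only illustrative):

```r
b.pos <- 3
d.pos <- 7

# `:` is evaluated before `+`, so this is b.pos + (1:d.pos)
without_parens <- b.pos+1:d.pos

# With parentheses the range starts right after b.pos, as intended
with_parens <- (b.pos+1):d.pos

print(without_parens)  # 4 5 6 7 8 9 10
print(with_parens)     # 4 5 6 7
```

    Without the parentheses the middle slice would run past position d.pos and duplicate elements in the result.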

  • 2020-12-01 16:34

    Here's an alternative that uses append. It's fine for small vectors, but I can't imagine it being efficient for large ones, since a new vector is created on each iteration of the loop. The trick is to process the insertions in decreasing order of position, so that append puts each one in the correct place relative to the original vector.

    a = c(2,3,4,9,10,2,4,19)
    b = c(2,1)
    d = c(0,1)
    
    pos <- c(3, 7)
    z <- setNames(list(b, d), pos)
    # sort by numeric position; sorting the names as strings would put "10" before "3"
    z <- z[order(as.numeric(names(z)), decreasing=TRUE)]
    
    
    for (i in seq_along(z)) {
      a <- append(a, z[[i]], after = as.numeric(names(z)[[i]]))
    }
    
    a
    #  [1]  2  3  4  2  1  9 10  2  4  0  1 19
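
    To see why the order of insertion matters, here is a small sketch with the same a, b and d. The names wrong and right are only illustrative, and the two nested append calls stand in for the loop:

```r
a <- c(2,3,4,9,10,2,4,19)
b <- c(2,1)
d <- c(0,1)

# Ascending order: inserting b first shifts everything after position 3,
# so "after = 7" no longer points at the original 7th element.
wrong <- append(append(a, b, after = 3), d, after = 7)

# Descending order: inserting d first leaves positions 1 to 7 untouched,
# so "after = 3" still refers to the original vector.
right <- append(append(a, d, after = 7), b, after = 3)

print(wrong)  # 2 3 4 2 1 9 10 0 1 2 4 19
print(right)  # 2 3 4 2 1 9 10 2 4 0 1 19
```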
    
  • 2020-12-01 16:38

    After trying Ferdinand's function, I wrote my own, and surprisingly it is far more efficient.
    Here's mine:

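    insertElems = function(vect, pos, elems) {
      # pos[i] is the index in the original vect before which elems[i] is inserted;
      # pos is assumed to be sorted in increasing order
      l = length(vect)
      j = 0  # number of elements inserted so far
      for (i in seq_along(pos)) {
        if (pos[i] == 1)
          vect = c(elems[j+1], vect)
        else if (pos[i] == l+1)  # insert after the last original element
          vect = c(vect, elems[j+1])
        else
          vect = c(vect[1:(pos[i]-1+j)], elems[j+1], vect[(pos[i]+j):(l+j)])
        j = j+1
      }
      return(vect)
    }
    
    tmp = 1:5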
    insertElems(tmp, c(2,4,5), c(NA,NA,NA))
    # [1]  1 NA  2  3 NA  4 NA  5
    
    insert.at(tmp, c(2,4,5), c(NA,NA,NA))
    # [1]  1 NA  2  3 NA  4 NA  5
    

    Here is the benchmark result:

    > microbenchmark(insertElems(tmp, c(2,4,5), c(NA,NA,NA)), insert.at(tmp, c(2,4,5), c(NA,NA,NA)), times = 10000)
    Unit: microseconds
                                            expr    min     lq     mean median     uq      max neval
     insertElems(tmp, c(2, 4, 5), c(NA, NA, NA))  9.660 11.472 13.44247  12.68 13.585 1630.421 10000
       insert.at(tmp, c(2, 4, 5), c(NA, NA, NA)) 58.866 62.791 70.36281  64.30 67.923 2475.366 10000
    

    My code also handles some cases where insert.at fails:

    > insert.at(tmp, c(1,4,5), c(NA,NA,NA))
    # [1]  1  2  3 NA  4 NA  5 NA  1  2  3
    # Warning message:
    # In result[c(TRUE, FALSE)] <- split(a, cumsum(seq_along(a) %in% (pos))) :
    #   number of items to replace is not a multiple of replacement length
    
    > insertElems(tmp, c(1,4,5), c(NA,NA,NA))
    # [1] NA  1  2  3 NA  4 NA  5
    
  • 2020-12-01 16:42

    Try this:

    result <- vector("list",5)
    result[c(TRUE,FALSE)] <- split(a, cumsum(seq_along(a) %in% (c(3,7)+1)))
    result[c(FALSE,TRUE)] <- list(b,d)
    f <- unlist(result)
    
    identical(f, e)
    #[1] TRUE
    

    EDIT: generalizing to an arbitrary number of insertions is straightforward:

    insert.at <- function(a, pos, ...){
        dots <- list(...)
        stopifnot(length(dots)==length(pos))
        result <- vector("list",2*length(pos)+1)
        result[c(TRUE,FALSE)] <- split(a, cumsum(seq_along(a) %in% (pos+1)))
        result[c(FALSE,TRUE)] <- dots
        unlist(result)
    }
    
    
    > insert.at(a, c(3,7), b, d)
     [1]  2  3  4  2  1  9 10  2  4  0  1 19
    
    > insert.at(1:10, c(4,7,9), 11, 12, 13)
     [1]  1  2  3  4 11  5  6  7 12  8  9 13 10
    
    > insert.at(1:10, c(4,7,9), 11, 12)
    Error: length(dots) == length(pos) is not TRUE
    

    Note the bonus error checking if the number of positions and insertions do not match.
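
    For readers unfamiliar with the split/cumsum idiom used in insert.at, here is a step-by-step sketch on the same a and positions. The intermediate names starts, groups and pieces are only illustrative and do not appear in the function:

```r
a <- c(2,3,4,9,10,2,4,19)
pos <- c(3, 7)

# TRUE at the first element of each new piece (positions 4 and 8)
starts <- seq_along(a) %in% (pos + 1)

# A running count of the starts gives one group id per element
groups <- cumsum(starts)
print(groups)  # 0 0 0 1 1 1 1 2

# split() cuts a into one piece per group id
pieces <- split(a, groups)
print(pieces)
```

    The pieces then fill the odd slots of result and the inserted vectors fill the even slots. One consequence, as far as I can tell: a position equal to length(a) produces no new piece, so the number of pieces no longer equals length(pos) + 1 and the alternation breaks.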

  • 2020-12-01 16:42

    You can use the following function,

    ins(a, list(b, d), pos=c(3, 7))
    # [1]  2  3  4  2  1  9 10  2  4  0  1 19
    

    where:

    ins <- function(a, to.insert=list(), pos=c()) {
    
      # handles exactly two insertions; assumes pos[1] < pos[2] < length(a)
      c(a[seq(pos[1])],
        to.insert[[1]],
        a[seq(pos[1]+1, pos[2])],
        to.insert[[2]],
        a[seq(pos[2]+1, length(a))]  # start after pos[2] so a[pos[2]] is not repeated
        )
    }
    
  • 2020-12-01 16:52

    Here's another function, using Ricardo's syntax, Ferdinand's split and @Arun's interleaving trick from another question:

    ins2 <- function(a,bs,pos){
        as <- split(a,cumsum(seq_along(a)%in%(pos+1)))
        idx <- order(c(seq_along(as),seq_along(bs)))
        unlist(c(as,bs)[idx])
    }
    

    The advantage is that this should extend to more insertions. However, it may produce weird output when passed invalid arguments, e.g., with any(pos > length(a)) or length(bs)!=length(pos).

    You can change the last line to unname(unlist(... if you don't want a's items named.
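
    The interleaving step can be seen in isolation with a small sketch. Here as_pieces stands in for the output of split (written out by hand and unnamed for brevity), and keys is an illustrative name:

```r
as_pieces <- list(c(2,3,4), c(9,10,2,4), 19)  # pieces of a
bs <- list(c(2,1), c(0,1))                    # vectors to insert

# Each piece and each insertion gets its rank within its own list
keys <- c(seq_along(as_pieces), seq_along(bs))  # 1 2 3 1 2

# order() keeps ties in their original order, so piece k comes before insertion k
idx <- order(keys)
print(idx)  # 1 4 2 5 3

result <- unlist(c(as_pieces, bs)[idx])
print(result)  # 2 3 4 2 1 9 10 2 4 0 1 19
```

    This relies on order() breaking ties by original position, which is why each piece of a lands immediately before the insertion with the same rank.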
