Writing to a dataframe from a for-loop in R

遥遥无期 2020-12-08 08:50

I'm trying to write from a loop to a data frame in R, for example a loop like this:

for (i in 1:20) {
  print(c(i+i, i*i, i/1))
}

and to write

4 Answers
  • 2020-12-08 09:04

    A for loop is run for its side effects rather than for a return value, so the usual way of doing this is to create the data frame before the loop and fill it on each iteration. You can either allocate it at its final size and assign your values to the i-th row on each iteration, or start with an empty data frame, append a row with rbind(), and reassign the whole thing each time.

    The former approach performs better for large datasets, because rbind() copies the whole data frame on every iteration.
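
    A minimal sketch of the preallocation approach, assuming three numeric columns. The column names a, b and c are placeholders, not something the question specifies:

    ```r
    n <- 20

    # Allocate the data frame at its final size before the loop.
    # Column names a, b, c are hypothetical placeholders.
    d <- data.frame(a = numeric(n), b = numeric(n), c = numeric(n))

    for (i in seq_len(n)) {
      # Assign the three values to row i in place
      d[i, ] <- c(i + i, i * i, i / 1)
    }

    head(d, 3)
    ```

    Because all rows exist up front, each iteration only overwrites one row instead of building a new data frame.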

  • 2020-12-08 09:13

    If all your values have the same type and you know the number of rows, you can use a matrix in the following way (this will be very fast):

    d <- matrix(nrow=20, ncol=3) 
    for (i in 1:20) { d[i,] <- c(i+i, i*i, i/1)}
    

    If you need a data frame, you can use rbind (as another answer suggests), or a function from the plyr package such as ldply:

    library(plyr)
    ldply(1:20, function(i)c(i+i, i*i, i/1))
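
    If you would rather avoid the plyr dependency, one base-R option is to fill the matrix as above and convert it once after the loop. This is only a sketch, and the column names are placeholders:

    ```r
    d <- matrix(nrow = 20, ncol = 3)
    for (i in 1:20) {
      d[i, ] <- c(i + i, i * i, i / 1)
    }

    # Convert the filled matrix to a data frame in a single step.
    # The column names below are hypothetical placeholders.
    df <- as.data.frame(d)
    names(df) <- c("double", "square", "identity")

    str(df)
    ```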
    
  • 2020-12-08 09:16

    You could use rbind:

    d <- data.frame()
    for (i in 1:20) {d <- rbind(d,c(i+i, i*i, i/1))}
    
  • 2020-12-08 09:25

    Another way would be

    do.call("rbind", sapply(1:20, FUN = function(i) c(i+i,i*i,i/1), simplify = FALSE))
    
    
         [,1] [,2] [,3]
     [1,]    2    1    1
     [2,]    4    4    2
     [3,]    6    9    3
     [4,]    8   16    4
     [5,]   10   25    5
     [6,]   12   36    6
    

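    If you don't specify simplify = FALSE, sapply simplifies the result to a 3 x 20 matrix with one column per iteration, so you have to transpose it using t to get one row per iteration. This can be tedious for large structures.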
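
    To make the difference concrete, here is a small sketch comparing both variants. The helper name f and the variable names are only for illustration:

    ```r
    # Hypothetical helper wrapping the per-iteration computation
    f <- function(i) c(i + i, i * i, i / 1)

    # Default simplification: a 3 x 20 matrix, one column per value of i
    m_wide <- sapply(1:20, f)
    dim(m_wide)

    # Transposing gives one row per value of i
    m_long <- t(m_wide)

    # With simplify = FALSE, sapply returns a list that rbind can stack directly
    m_list <- do.call("rbind", sapply(1:20, f, simplify = FALSE))

    # Both routes should hold the same values
    all.equal(m_long, m_list)
    ```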

    This solution is especially handy if you have a fairly large data set and/or you need to repeat this many times.

    Here are some timings for the solutions in this thread.

    > system.time(do.call("rbind", sapply(1:20000, FUN = function(i) c(i+i,i*i,i/1), simplify = FALSE)))
       user  system elapsed 
       0.05    0.00    0.05 
    
    > system.time(ldply(1:20000, function(i)c(i+i, i*i, i/1)))
       user  system elapsed 
       0.14    0.00    0.14 
    
    > system.time({d <- matrix(nrow=20000, ncol=3) 
    + for (i in 1:20000) { d[i,] <- c(i+i, i*i, i/1)}})
       user  system elapsed 
       0.10    0.00    0.09 
    
    > system.time(ldply(1:20000, function(i)c(i+i, i*i, i/1)))
       user  system elapsed 
      62.88    0.00   62.99 
    