How to append rows to an R data frame

太阳男子 2020-11-28 02:04

I have looked around StackOverflow, but I cannot find a solution specific to my problem, which involves appending rows to an R data frame.

I am initializing an empty

7 answers
  •  伪装坚强ぢ
    2020-11-28 02:35

    Update

    Not knowing what you are trying to do, I'll share one more suggestion: Preallocate vectors of the type you want for each column, insert values into those vectors, and then, at the end, create your data.frame.

    Continuing with Julian's f3 (a preallocated data.frame) as the fastest option so far, defined as:

    # pre-allocate space
    f3 <- function(n){
      df <- data.frame(x = numeric(n), y = character(n), stringsAsFactors = FALSE)
      for(i in 1:n){
        df$x[i] <- i
        df$y[i] <- toString(i)
      }
      df
    }
    

    Here's a similar approach, but one where the data.frame is created as the last step.

    # Use preallocated vectors
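    f4 <- function(n) {
      # Allocate both columns at full length up front
      x <- numeric(n)
      y <- character(n)
      for (i in seq_len(n)) {  # seq_len() also handles n = 0, unlike 1:n
        x[i] <- i
        y[i] <- i  # coerced to character on assignment
      }
      # Build the data.frame once, after the loop
      data.frame(x, y, stringsAsFactors = FALSE)
    }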
    

    microbenchmark from the "microbenchmark" package will give us more comprehensive insight than system.time:

    library(microbenchmark)
    microbenchmark(f1(1000), f3(1000), f4(1000), times = 5)
    # Unit: milliseconds
    #      expr         min          lq      median         uq         max neval
    #  f1(1000) 1024.539618 1029.693877 1045.972666 1055.25931 1112.769176     5
    #  f3(1000)  149.417636  150.529011  150.827393  151.02230  160.637845     5
    #  f4(1000)    7.872647    7.892395    7.901151    7.95077    8.049581     5
    

    f1() (the approach below) is incredibly inefficient because of how often it calls data.frame and because growing objects that way is generally slow in R. f3() is much improved due to preallocation, but the data.frame structure itself might be part of the bottleneck here. f4() tries to bypass that bottleneck without compromising the approach you want to take.
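
    To make the comparison concrete without the microbenchmark package, here is a minimal, self-contained sketch. The names f1_sketch and f4_sketch are hypothetical stand-ins: f1_sketch is only my assumption of what the rbind-based approach looks like, and f4_sketch mirrors the preallocated-vector idea. It checks that both produce the same contents and then times them with base R's system.time.

```r
# Hypothetical reconstruction of the slow approach: grow the data frame with rbind().
f1_sketch <- function(n) {
  df <- data.frame(x = numeric(), y = character(), stringsAsFactors = FALSE)
  for (i in seq_len(n)) {
    df <- rbind(df, data.frame(x = i, y = toString(i), stringsAsFactors = FALSE))
  }
  df
}

# Preallocated vectors, with the data frame built once at the end.
f4_sketch <- function(n) {
  x <- numeric(n)
  y <- character(n)
  for (i in seq_len(n)) {
    x[i] <- i
    y[i] <- toString(i)
  }
  data.frame(x, y, stringsAsFactors = FALSE)
}

n <- 200
slow <- f1_sketch(n)
fast <- f4_sketch(n)

# Same contents either way (compared column by column, ignoring storage type).
same_x <- all(as.numeric(slow$x) == fast$x)
same_y <- identical(as.character(slow$y), fast$y)

# Timings vary by machine, so no expected numbers are shown here.
print(system.time(f1_sketch(n)))
print(system.time(f4_sketch(n)))
```

    The columns are compared after as.numeric/as.character because the two approaches may end up with different storage types for x (integer vs. double), even though the values match.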


    Original answer

    This is really not a good idea, but if you wanted to do it this way, I guess you can try:

    for (i in 1:10) {
      df <- rbind(df, data.frame(x = i, y = toString(i)))
    }
    

    Note that in your code, there is one other problem:

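    • You should set stringsAsFactors = FALSE if you do not want character columns converted to factors (this has been the default since R 4.0.0, but it must be set explicitly in earlier versions). Use: df = data.frame(x = numeric(), y = character(), stringsAsFactors = FALSE)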
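
    As a quick check of the point above, this small sketch (variable names are my own, chosen for illustration) shows how the stringsAsFactors argument changes the class of the y column, and that a row appended with rbind keeps the character type when the new row is built the same way.

```r
# Empty data frame with typed columns; y stays a character vector.
df <- data.frame(x = numeric(), y = character(), stringsAsFactors = FALSE)
y_class <- class(df$y)

# Same declaration with conversion forced on: y becomes a factor.
df_factor <- data.frame(x = numeric(), y = character(), stringsAsFactors = TRUE)
y_factor_class <- class(df_factor$y)

# Appending one row keeps the character type when the new row also avoids factors.
df <- rbind(df, data.frame(x = 1, y = "1", stringsAsFactors = FALSE))

print(y_class)
print(y_factor_class)
str(df)
```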
