Update an entire row in data.table in R

野趣味 2021-01-04 04:56

I have a data.table object in R that has 9,000 columns. My code calculates new values for all 9,000 columns at once and returns a vector of values. I'd like to just repla

2 Answers
  •  日久生厌
    2021-01-04 05:24

    This can also be done with set(), referencing the row by its number. For the example above:

    set(d, 1L, names(d), as.list(vec))
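
For readers who do not have the objects `d` and `vec` from the example at hand, here is a minimal self-contained sketch of the same call. The three-column table and the replacement values are hypothetical stand-ins for the 9,000-column table in the question; only `set()` and its `i`, `j`, `value` arguments come from data.table itself.

```r
library(data.table)

# Hypothetical small table standing in for the 9,000-column one
d <- data.table(a = c(1, 2, 3), b = c(10, 20, 30), c = c(100, 200, 300))

# One new value per column, in column order (assumed to come from your own code)
vec <- c(-1, -2, -3)

# i = row number, j = column names, value = list with one element per column.
# set() modifies d by reference, so no assignment of the result is needed.
set(d, i = 2L, j = names(d), value = as.list(vec))

# Row 2 now holds -1, -2, -3; rows 1 and 3 are unchanged
print(d)
```

The vector is wrapped in `as.list()` so that each element is treated as the value for one column rather than the whole vector being recycled into every column.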
    

    You may gain some speed by using set() instead of :=, but you lose part of that advantage if you first have to look up the row numbers.

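    library(data.table)
    library(microbenchmark)

    # Create a large data.table: 1e5 rows, 9,000 integer columns
    DT = data.table(col1 = 1:1e5)
    cols = paste0('col', 1:9e3)
    for (col in cols){ DT[, (col) := 1:1e5] }
    vec <- rep(5,9e3) # One replacement value per column
    
    # Test options
    microbenchmark(
      row_idnx <- DT[,.I[col1 == 1L]], # Retrieve row number
      set(DT, row_idnx, names(DT), as.list(vec)), # Update by row number
      DT[col1 == 1L, names(DT) := as.list(vec)] # Subset and update in one call
    )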
    
    Unit: microseconds
                                              expr      min        lq      mean    median        uq       max neval
                  row_idnx <- DT[, .I[col1 == 1L]] 1255.430 1969.5630 2168.9744 2129.2635 2302.1000  3269.947   100
        set(DT, row_idnx, names(DT), as.list(vec))  171.606  207.3235  323.7642  236.6765  274.6515  7725.120   100
     DT[col1 == 1L, `:=`(names(DT), as.list(vec))] 2761.289 2998.3750 3361.7842 3155.8165 3444.6310 13473.081   100
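
The two approaches timed above can be compared on a small scale to confirm that they produce the same result. This is only a sketch under assumptions: the table, the `id` lookup column, and the names `row_idx`, `cols`, and `new_vals` are hypothetical and not part of the original answer.

```r
library(data.table)

# Hypothetical table: `id` identifies the row, v1 and v2 are the columns to replace
DT  <- data.table(id = c("x", "y", "z"), v1 = c(1, 2, 3), v2 = c(4, 5, 6))
DT2 <- copy(DT)            # independent copy for the second approach

cols     <- c("v1", "v2")  # columns to overwrite (the lookup column is left out)
new_vals <- c(0, 0)        # one new value per column in `cols`

# Option A: look up the row number first, then update by reference with set()
row_idx <- DT[, .I[id == "y"]]
set(DT, i = row_idx, j = cols, value = as.list(new_vals))

# Option B: subset and assign in a single [.data.table call
DT2[id == "y", (cols) := as.list(new_vals)]

print(DT)
print(DT2)
```

Leaving the lookup column out of the replaced columns keeps the row findable on later calls. In the benchmark above, `vec` appears to overwrite `col1` as well, so after the first evaluation the condition `col1 == 1L` may no longer match any row, which is worth keeping in mind when reading the timings.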
    
