What is the equivalent of mutate_at (dplyr) in data.table?

后端 未结 2 1480
爱一瞬间的悲伤
爱一瞬间的悲伤 2020-12-20 23:15

I am trying to move some of my slower processes in dplyr to using data.table, however can not seem to find an efficient way of using a \"mutate_at\" type approach in data.ta

相关标签:
2条回答
  • 2020-12-20 23:58

    And also from Apply multiple functions to multiple columns in data.table

    f <- function(x) lapply(c(Roll.Mean.Week=7L, Roll.Mean.Two.Week=14L),
        function(n) rollapplyr(x, width=n, mean, fill=0L, na.rm=TRUE, partial=TRUE))
    DT[, unlist(lapply(.SD, f), recursive=FALSE), A, .SDcols=B:C]
    

    output:

         A B._Roll.Mean.Week B._Roll.Mean.Two.Week C._Roll.Mean.Week C._Roll.Mean.Two.Week
      1: 1                 1                     1               101                   101
      2: 1                 2                     2               102                   102
      3: 1                 3                     3               103                   103
      4: 1                 4                     4               104                   104
      5: 1                 5                     5               105                   105
      6: 1                 6                     6               106                   106
      7: 1                 7                     7               107                   107
      8: 1                 9                     8               109                   108
      9: 1                11                     9               111                   109
     10: 1                13                    10               113                   110
     [trunc...]
    
    0 讨论(0)
  • 2020-12-21 00:03

    With data.table, we can specify the columns of interest in .SDcols, loop through the .SD with lapply and apply the function of interest. Here, the funcion rollapply is repeated with only change in width parameter. So, it may be better to create a function to avoid repeating the whole arguments. Also, while applying the function (f1), the output can be kept in a list, later unlist with recursive = FALSE and assign (:=) to columns of interest

    library(data.table)
    library(zoo)
    nm1 <- c("B", "C")
    nm2 <- paste0(nm1, "_Roll.Mean.Week")
    nm3 <- paste0(nm1, "_Roll.Mean.Two.Week")
    f1 <- function(x, width) rollapply(x, width = width, mean,
            align = "right", fill = 0, na.rm = TRUE, partial = TRUE)
    setDT(Data)[, c(nm2, nm3) := unlist(lapply(.SD, function(x)
      list(f1(x, 7), f1(x, 14))), recursive = FALSE), by = A, .SDcols = nm1]
    head(Data)
    #   A B   C B_Roll.Mean.Week C_Roll.Mean.Week B_Roll.Mean.Two.Week C_Roll.Mean.Two.Week
    #1: 1 1 101                1                1                  101                  101
    #2: 2 2 102                2                2                  102                  102
    #3: 1 3 103                2                2                  102                  102
    #4: 2 4 104                3                3                  103                  103
    #5: 1 5 105                3                3                  103                  103
    #6: 2 6 106                4                4                  104                  104
    

    Note that funs is deprecated in tidyverse and in its place, can use list(~ or just ~

    Data %>% 
        group_by(A) %>% 
        mutate_at(vars(B,C), list(Roll.Mean.Week =  ~f1(., 7),
                                  Roll.Mean.Two.Week = ~ f1(., 14)))%>% 
        ungroup()
    
    0 讨论(0)
提交回复
热议问题