Pass variables and names to a data.table function

故里飘歌 2020-12-19 15:55

I have a report that needs to be applied to different data.table column names (in both j and by). The only way I get it done is by wrapping the arguments in an eval(substit

1 answer
  • 2020-12-19 16:23

    How about wrapping the entire body of the function in eval(substitute(...)) (or just the data.table calculation, if you want to be more specific):

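    var.report = function(df, value, by.value) {
      # substitute() replaces df, value and by.value with the expressions
      # passed by the caller; eval() then runs the resulting code
      eval(substitute({
        var.report = df[, list( .N,
                          sum(is.finite(value)), # count finite values
                          sum(is.na(value))      # count NA
        ), by = by.value]

        # rename all four result columns by reference
        setnames(var.report, c("variable", "N","n.val","n.NA"))

        return(var.report)
      }))
    }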
    
    var.report(dt, depth, clarity)
    #   variable     N n.val n.NA
    #1:      SI2  9194  9194    0
    #2:      SI1 13065 13065    0
    #3:      VS1  8171  8171    0
    #4:      VS2 12258 12258    0
    #5:     VVS2  5066  5066    0
    #6:     VVS1  3655  3655    0
    #7:       I1   741   741    0
    #8:       IF  1790  1790    0
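
    To see why this works, it can help to look at the call that substitute() builds before eval() runs it. The sketch below is only an illustration: the table toy, its columns grp and x, and the helper show.call are made up for this example and are not part of the original report.

    ```r
    library(data.table)

    # Hypothetical toy data, not the dt used above
    toy <- data.table(grp = c("a", "a", "b"), x = c(1, NA, 3))

    # Return the substituted, still unevaluated call instead of running it
    show.call <- function(df, value, by.value) {
      substitute(df[, list(.N, sum(is.na(value))), by = by.value])
    }

    cl <- show.call(toy, x, grp)
    print(cl)
    # toy[, list(.N, sum(is.na(x))), by = grp]

    # Evaluating the call gives the same result as typing it by hand
    res <- eval(cl)
    ```

    The argument names have been replaced by whatever the caller typed, so data.table sees ordinary column names in j and by.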
    

    I don't really understand the second question, and I'd normally assign the names in the original expression, which helps keep track of things better, like so:

    var.report = df[, list(N     = .N,
                           n.val = sum(is.finite(value)), # count values
                           n.NA  = sum(is.na(value)) # count NA
                          )
                    , by = list(variable = by.value)]
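
    Put together, a complete version with the names assigned in the expression might look like the sketch below. It assumes the same eval(substitute(...)) wrapper as above; the table toy and its columns grp and x are hypothetical stand-ins for the real data.

    ```r
    library(data.table)

    # Hypothetical toy data standing in for the real table
    toy <- data.table(grp = c("a", "a", "b", "b"),
                      x   = c(1.5, NA, 3, Inf))

    var.report <- function(df, value, by.value) {
      # substitute() swaps df, value and by.value for the expressions the
      # caller passed; eval() then runs the resulting data.table call
      eval(substitute(
        df[, list(N     = .N,
                  n.val = sum(is.finite(value)),  # finite, non-missing values
                  n.NA  = sum(is.na(value))),     # missing values
           by = list(variable = by.value)]
      ))
    }

    res <- var.report(toy, x, grp)
    ```

    Since the columns are named inside list(), the separate setnames() call is no longer needed. On this toy table each group has two rows and one finite value, and only group "a" has an NA.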
    