How do I pass column-specific arguments to lapply in data.table .SD?

Submitted by 微笑、不失礼 on 2019-12-23 02:55:18

Question


I have seen examples of using .SD with lapply in data.table with a simple function, as below:

DT[ , c("b","d","e") := lapply(.SD, tan), .SDcols = c("b","d","e")]

But I'm unsure of how to use column-specific arguments in a multiple argument function. For instance I have a winsorize function, I want to apply it to a subset of columns in a data table but using column-specific percentiles, e.g.

library(DescTools)
wlevel <- list(b=list(lower=0.01,upper=0.99), c=list(lower=0.02,upper=0.95))
DT[ , c("b","c") := lapply(.SD, function(x) 
{winsorize(x,wlevel$zzz$lower,wlevel$zzz$upper)}), .SDcols = c("b","c")]

Here zzz stands for whichever column is currently being processed. I have also seen threads on varying the arguments passed through lapply, but not in the context of data.table with .SDcols.

Is this possible to do?

This is a toy example; I want to generalize to an arbitrarily large number of columns. Looping is always an option, but I am trying to see if there's a more elegant/efficient solution...


Answer 1:


How to use column-specific arguments in a multiple argument function?

Use mapply(FUN, dat, params1, params2, ...) where each of params1, params2, ... can be a list or vector; mapply iterates over each of dat, params1, params2, ... in parallel.

Note that unlike the rest of the apply/lapply/sapply family, with mapply the function argument comes first, then the data and parameter(s).
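As a minimal base-R sketch of that argument order (the names dat, factors and offsets are made up for illustration), the function comes first and mapply then walks the remaining arguments in parallel, element by element:

```r
# Scale each vector by its own factor, then shift it by its own offset.
dat     <- list(a = c(1, 2, 3), b = c(10, 20, 30))
factors <- c(2, 0.5)
offsets <- c(0, 1)

# FUN first, then the data and the per-element parameters.
# SIMPLIFY = FALSE keeps the result as a list instead of collapsing it to a matrix.
res <- mapply(function(x, f, o) x * f + o,
              dat, factors, offsets,
              SIMPLIFY = FALSE)

# res$a is c(2, 4, 6); res$b is c(6, 11, 16)
```

Map(f, ...) is the same call with SIMPLIFY = FALSE already applied, so it can be used interchangeably here.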

In your case, something like the following should work (treat it as pseudo-code; you'll need to tweak it to match the actual signature of your winsorize function). Instead of your nested wlevel list, it is probably easier to unpack the bounds into one list per argument:

w_lower <- list(b=0.01, c=0.02)
w_upper <- list(b=0.99, c=0.95) 

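# SIMPLIFY = FALSE keeps the result as a list of columns, which is what := expects;
# by default mapply would collapse equal-length results into a matrix.
DT[ , c('b','c') := mapply(function(x, w_lower_col, w_upper_col) { winsorize(x, w_lower_col, w_upper_col) },
  .SD, w_lower, w_upper, SIMPLIFY = FALSE), .SDcols = c('b', 'c')]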

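We shouldn't need to use column names (your zzz) to index into the lists; mapply() just iterates over them as-is. Note that the matching is by position, not by name, so w_lower and w_upper must list their entries in the same order as .SDcols.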
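For reference, here is a self-contained sketch of the whole approach. It makes a few assumptions: winsorize below is a hypothetical stand-in written in base R (it is not the DescTools function) that clamps a vector to its own lower/upper quantiles, and DT is made-up random data. The parameter vectors are indexed by cols so that they line up with .SDcols whatever order they were declared in.

```r
library(data.table)

# Hypothetical helper (not DescTools): clamp x to its own lower/upper quantiles.
winsorize <- function(x, lower, upper) {
  q <- quantile(x, probs = c(lower, upper), na.rm = TRUE, names = FALSE)
  pmin(pmax(x, q[1]), q[2])
}

set.seed(1)
DT <- data.table(a = 1:100, b = rnorm(100), c = rnorm(100))

cols    <- c("b", "c")
w_lower <- c(b = 0.01, c = 0.02)
w_upper <- c(b = 0.99, c = 0.95)

# mapply pairs .SD's columns with the parameters by position, so index the
# parameters by `cols` to guarantee the same order as .SDcols.
# SIMPLIFY = FALSE returns a list with one element per column for :=.
DT[, (cols) := mapply(winsorize, .SD, w_lower[cols], w_upper[cols],
                      SIMPLIFY = FALSE),
   .SDcols = cols]
```

Adding more columns then only requires extending cols and the two parameter vectors; the := call itself does not change.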



Source: https://stackoverflow.com/questions/52022675/how-do-i-pass-column-specific-arguments-to-lapply-in-data-table-sd
