Multiple variable filters in R

陌清茗 2021-01-24 07:53

I'm trying to figure out the best way to use multiple variable filters in R.

I usually have up to 100 variables (in one condition) and need to filter cases where ANY

3 Answers
  •  孤独总比滥情好
    2021-01-24 08:23

    I think your second statement in base R is OK, just try it with [ instead of subset:

    rowSums(df[sprintf("x%d", 1:10)]==37) > 0
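
    To make the one-liner above concrete, here is a minimal sketch on a toy data frame. The data, the id column, and the names cols, hit, hit_na and matched are made up for illustration; only the rowSums pattern comes from the answer. It also shows one thing to watch for: a missing value in a row that has no match makes the result NA, which na.rm = TRUE avoids by treating missing values as "no match".

    ```r
    # Toy data (hypothetical): 4 rows, filter columns x1..x3 plus an unrelated id
    df <- data.frame(
      id = 1:4,
      x1 = c(37, 1, 2, NA),
      x2 = c(5, 6, 37, 8),
      x3 = c(9, 10, 11, 12)
    )

    # Build the column names instead of typing them out: "x1" "x2" "x3"
    cols <- sprintf("x%d", 1:3)

    # df[cols] == 37 is a logical matrix; rowSums counts the TRUEs per row,
    # so "> 0" means "at least one of the selected columns equals 37"
    hit <- rowSums(df[cols] == 37) > 0

    # Row 4 has an NA and no match, so its entry in hit is NA.
    # With na.rm = TRUE the missing value is ignored and the row counts as FALSE.
    hit_na <- rowSums(df[cols] == 37, na.rm = TRUE) > 0

    # Use the logical vector to keep only the matching rows
    matched <- df[hit_na, ]
    ```

    Whether NA should count as "no match" depends on the data, so treat the na.rm = TRUE variant as one possible choice rather than the required one.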
    

    Benchmarks:

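    library(microbenchmark)
    library(data.table)  # needed for the dt_reduce expression

    # df (a data.frame) and dt (a data.table with the same columns x1..x10)
    # are assumed to exist already
    microbenchmark(times = 20,
      subset    = rowSums(subset(df, select = c(x1:x10)) == 37) > 0,
      dt_reduce = dt[, Reduce('|', lapply(.SD, '==', 37)), .SDcols = x1:x10],
      base_r    = rowSums(df[sprintf("x%d", 1:10)] == 37) > 0
    )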
    
    # Unit: milliseconds
    #       expr      min       lq     mean   median        uq       max neval
    #     subset 82.74922 88.63819 99.69935 91.18369 110.24876 134.06550    20
    #  dt_reduce 25.78002 28.62765 32.73945 28.89021  29.12712  71.25822    20
    #     base_r 21.52504 24.27624 27.03380 25.83219  26.24400  65.38550    20
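
    The data behind these timings is not shown, so the following is only a sketch of how df and dt could be set up to try the comparison yourself. The row count, the value range and the seed are assumptions, not the original benchmark data, and timings will differ from the numbers above. It also checks that the base R and data.table expressions return the same logical vector.

    ```r
    library(data.table)

    # Hypothetical test data: 100,000 rows, ten integer columns x1..x10
    # with values drawn from 1..100 (all of these are assumptions)
    set.seed(42)
    n_rows <- 1e5
    col_names <- paste0("x", 1:10)
    df <- as.data.frame(
      setNames(
        replicate(10, sample(100L, n_rows, replace = TRUE), simplify = FALSE),
        col_names
      )
    )
    dt <- as.data.table(df)

    # Base R: count matches per row, keep rows with at least one
    base_r <- rowSums(df[col_names] == 37) > 0

    # data.table: compare each column to 37, then OR the columns together
    dt_reduce <- dt[, Reduce("|", lapply(.SD, "==", 37)), .SDcols = col_names]

    # rowSums keeps row names, so drop them before comparing
    same_result <- identical(unname(base_r), dt_reduce)
    ```

    Here .SDcols is given as a character vector of names rather than the x1:x10 range used in the benchmark; both select the same columns in this setup.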
    
