Multiple variable filters in R

陌清茗 2021-01-24 07:53

I'm trying to figure out the best way to use multiple variable filters in R.

I usually have up to 100 variables (in one condition) and need to filter cases where ANY

3 answers
  •  被撕碎了的回忆
    2021-01-24 08:17

    You're looking for a function that works on every row of your data frame. That's what "apply" does. In the timings below it is clearly slower than the chained comparisons (0.61 s versus 0.02 s elapsed), but it is easier to handle when there are many columns:

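    # Chained comparisons; with() looks up x1..x10 inside df
    system.time(
        with(df, (x1==37) | (x2==37) | (x3==37) | (x4==37) | (x5==37) | (x6==37) | (x7==37) | (x8==37) | (x9==37) | (x10==37))
    )
    # user  system elapsed 
    # 0.02    0.00    0.02 

    # Row-wise check with apply(); columns 2:11 are assumed to hold x1..x10
    system.time(
        apply(df, 1, function(x) any(x[2:11] == 37))
    )
    # user  system elapsed 
    # 0.59    0.00    0.61 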
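
    If the row-wise apply() call is too slow, the same "any column equals 37" test can probably be written with vectorized base R functions instead. The following is only a sketch and not part of the original answer: the data frame is made up (the thread never shows df), and the names x1..x10, id and the value 37 are taken from the timings above.

    ```r
    # Hypothetical example data: an id column plus ten integer columns x1..x10
    set.seed(1)
    vals <- matrix(sample(1:100, 1000 * 10, replace = TRUE), ncol = 10)
    colnames(vals) <- paste0("x", 1:10)
    df <- data.frame(id = seq_len(1000), vals)

    # Select the columns by name so no condition has to be typed out by hand
    cols <- paste0("x", 1:10)

    # Row-wise version from the answer, restricted to the value columns
    hit_apply <- apply(df[cols], 1, function(x) any(x == 37))

    # Vectorized version 1: compare the whole block at once, count hits per row
    hit_rowsums <- rowSums(df[cols] == 37) > 0

    # Vectorized version 2: OR the per-column comparisons together
    hit_reduce <- Reduce(`|`, lapply(df[cols], function(col) col == 37))

    # ids of the rows where at least one column equals 37
    matching_ids <- df$id[hit_rowsums]
    ```

    Both vectorized variants scale to 100 columns by changing only cols. If the columns can contain NA, rowSums() would need na.rm = TRUE to avoid NA results for rows without a match.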
    

    Although you didn't ask about changing the data structure, I recommend having a look at tidy data. With a rearranged (long) version of your data frame, filtering is efficient and easy to handle:

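    library(tidyr)
    library(dplyr)  # provides filter() and select()

    # Reshape to long format: one row per (id, key, value)
    df2 = gather(df, key, value, -id)

    system.time(
        select(filter(df2, value==37), id)
    )
    #   user  system elapsed 
    #   0.02    0.00    0.02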
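
    In current tidyr releases gather() is superseded by pivot_longer(), so the long-format idea might be sketched as follows. This is an illustration under assumptions: the example data is invented, and key and value are simply the column names used in the answer above.

    ```r
    library(dplyr)
    library(tidyr)

    # Hypothetical example data: an id column plus ten integer columns x1..x10
    set.seed(1)
    vals <- matrix(sample(1:100, 200 * 10, replace = TRUE), ncol = 10)
    colnames(vals) <- paste0("x", 1:10)
    df <- data.frame(id = seq_len(200), vals)

    # Long format: one row per (id, key, value) combination
    df_long <- pivot_longer(df, cols = -id, names_to = "key", values_to = "value")

    # An id can match in several columns, so keep each id only once
    ids_with_37 <- df_long %>%
      filter(value == 37) %>%
      distinct(id)
    ```

    The distinct() step matters because a plain filter on the long table returns one row per matching cell, not one row per case.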
    
