How to conditionally mutate multiple columns using “contains” and “ifelse”?

佐手、 submitted on 2019-12-22 18:57:07

Question


I want to mutate multiple columns whose names contain the string "account". Specifically, I want these columns to take NA when a certain condition is met, and to keep their current value when it is not. Below is my attempt, inspired by the answers here and here. So far it has been unsuccessful. I am still trying, but any help would be much appreciated.

My data

df<-as.data.frame(structure(list(low_account = c(1, 1, 0.5, 0.5, 0.5, 0.5), high_account = c(16, 
16, 56, 56, 56, 56), mid_account_0 = c(8.5, 8.5, 28.25, 28.25, 
28.25, 28.25), mean_account_0 = c(31.174, 30.1922101449275, 30.1922101449275, 
33.3055555555556, 31.174, 33.3055555555556), median_account_0 = c(2.1, 
3.8, 24.2, 24.2, 24.2, 24.2), low_account.1 = c(1, 1, 0.5, 0.5, 0.5, 
0.5), high_account.1 = c(16, 16, 56, 56, 56, 56), row.names = c("A001", "A002", "A003", "A004", "A005", "A006"))))

df
  low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
1         1.0           16          8.50       31.17400              2.1           1.0             16      A001
2         1.0           16          8.50       30.19221              3.8           1.0             16      A002
3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
6         0.5           56         28.25       33.30556             24.2           0.5             56      A006

My attempt

sample_data<-df%>% mutate_at(select(contains("account") , ifelse(. <= df$low_account&  >= df$high_account, NA, .)))

Error: No tidyselect variables were registered Call rlang::last_error() to see a backtrace

Expected output

df
    low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
    1         1.0           16          8.50       NA                    2.1           1.0             16      A001
    2         1.0           16          8.50       NA                    3.8           1.0             16      A002
    3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
    4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
    5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
    6         0.5           56         28.25       33.30556             24.2           0.5             56      A006

Answer 1:


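The issue with vars(contains('account')) is that it matches every column whose name contains the substring 'account', including 'low_account' and 'high_account' themselves. When the logical comparison runs, 'low_account' is converted to NA first, because each of its values is trivially less than or equal to itself, so every later comparison is made against that NA-replaced column. Note also that the two conditions need to be combined with | rather than &, since a value cannot be both at or below the lower bound and at or above the upper bound. So, instead, we can select only the columns of interest (the 'mid', 'mean' and 'median' columns) and then do the replace: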

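library(tidyverse)
# Restrict the selection to the mid/mean/median columns so the bound columns stay intact
df %>% 
   mutate_at(vars(matches("(mid|mean|median)_account")),
           # set a value to NA when it is <= low_account or >= high_account in the same row
           ~ replace(., .<= low_account | .>= high_account, NA))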
# low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
#1         1.0           16          8.50             NA              2.1           1.0             16      A001
#2         1.0           16          8.50             NA              3.8           1.0             16      A002
#3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
#4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
#5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
#6         0.5           56         28.25       33.30556             24.2           0.5             56      A006
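
Since dplyr 1.0.0 the scoped verbs such as mutate_at are superseded by across(), so the same idea can probably be written as below. This is only a sketch under a few assumptions: dplyr >= 1.0.0 is installed, the data frame is a shortened copy of the question's data (first three rows, without the duplicated '.1' columns), and the name 'result' is made up for illustration.

```r
library(dplyr)

# Shortened copy of the question's data (an assumption made for brevity)
df <- data.frame(
  low_account      = c(1, 1, 0.5),
  high_account     = c(16, 16, 56),
  mid_account_0    = c(8.5, 8.5, 28.25),
  mean_account_0   = c(31.174, 30.19221, 30.19221),
  median_account_0 = c(2.1, 3.8, 24.2)
)

# Select only the mid/mean/median columns, so the bound columns are never modified,
# and blank out values that are <= low_account or >= high_account in the same row
result <- df %>%
  mutate(across(
    matches("^(mid|mean|median)_account"),
    ~ replace(.x, .x <= low_account | .x >= high_account, NA)
  ))

result
```

With this shortened data, only the first two values of mean_account_0 should become NA, because they exceed high_account in their rows. The low_account and high_account columns are left untouched because the selection does not match them.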


Source: https://stackoverflow.com/questions/57043794/how-to-conditionally-mutate-multiple-columns-using-contains-and-ifelse
