Question
Given the dplyr workflow:
require(dplyr)
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(grepl(x = model, pattern = "Merc")) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
I'm interested in applying the filter step conditionally, depending on the value of applyFilter.
Solution
With applyFilter <- 1 the rows are filtered using the "Merc" string; without the filter, all rows are returned.
applyFilter <- 1
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(model %in%
if (applyFilter) {
rownames(mtcars)[grepl(x = rownames(mtcars), pattern = "Merc")]
} else
{
rownames(mtcars)
}) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
Problem
The suggested solution is inefficient because the filter call (with its if/else and %in% comparison) is always evaluated; a more desirable approach would evaluate the filter step only when applyFilter is 1.
Attempt
The ideal workflow would look something like this:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
# Only apply filter step if condition is met
if (applyFilter) {
filter(grepl(x = model, pattern = "Merc"))
}
%>%
# Continue
group_by(am) %>%
summarise(meanMPG = mean(mpg))
Naturally, the syntax above is incorrect. It's only an illustration of how the ideal workflow should look.
Desired answer
I'm not interested in creating an interim object; the workflow should resemble:
startingObject %>% ... conditional filter ... final object
Ideally, I would like to arrive at a solution where I can control whether the filter call is evaluated or not.
Answer 1:
How about this approach:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
filter(if(applyFilter == 1) grepl(x = model, pattern = "Merc") else TRUE) %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
This means grepl is only evaluated when applyFilter is 1; otherwise filter simply recycles a single TRUE, which keeps every row.
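The recycling behaviour can be checked directly. The following is a minimal sketch, assuming dplyr and tibble are installed; the names cars, all_rows and merc_rows are introduced here for illustration only and are not part of the original question.

```r
library(dplyr)

# Illustrative copy of the data with the row names moved into a column
cars <- tibble::rownames_to_column(mtcars, var = "model")

# A single TRUE is recycled to every row, so no rows are dropped
all_rows <- filter(cars, TRUE)
stopifnot(nrow(all_rows) == nrow(cars))

# With the flag set, grepl() runs and only the "Merc" models remain
applyFilter <- 1
merc_rows <- filter(cars, if (applyFilter == 1) grepl("Merc", model) else TRUE)
stopifnot(all(grepl("Merc", merc_rows$model)))
```

Note that filter itself still runs in both cases; only the grepl call is skipped when the flag is off.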
Or another option is to use {}:
mtcars %>%
tibble::rownames_to_column(var = "model") %>%
{if(applyFilter == 1) filter(., grepl(x = model, pattern = "Merc")) else .} %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
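If the same pattern is needed in several pipelines, the braces can be wrapped in a small helper. This is only a sketch: filter_when is a hypothetical name made up for this example, not a dplyr function, and it assumes the condition is a single TRUE/FALSE value.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): call filter() only when
# `condition` is TRUE, otherwise pass the data through untouched
filter_when <- function(.data, condition, ...) {
  if (isTRUE(condition)) filter(.data, ...) else .data
}

summarise_mpg <- function(applyFilter) {
  mtcars %>%
    tibble::rownames_to_column(var = "model") %>%
    filter_when(applyFilter == 1, grepl("Merc", model)) %>%
    group_by(am) %>%
    summarise(meanMPG = mean(mpg))
}

filtered   <- summarise_mpg(applyFilter = 1)  # filter step is applied
unfiltered <- summarise_mpg(applyFilter = 0)  # filter step is skipped entirely
```

Unlike the if/else inside filter, this version does not call filter at all when the condition is FALSE, which is closer to what the question asks for.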
There's obviously another possible approach: simply break the pipe, apply the filter conditionally, and then continue the pipe (I know the OP didn't ask for this; I just want to give another example for other readers):
library(magrittr) # needed for the %<>% assignment pipe
# Note: %<>% overwrites the working copy of mtcars
mtcars %<>%
tibble::rownames_to_column(var = "model")
if(applyFilter == 1) mtcars %<>% filter(grepl(x = model, pattern = "Merc"))
mtcars %>%
group_by(am) %>%
summarise(meanMPG = mean(mpg))
Source: https://stackoverflow.com/questions/44001722/conditionally-apply-pipeline-step-depending-on-external-value