Question
I want to use variable names as strings in dplyr functions. See the example below:
df <- data.frame(
color = c("blue", "black", "blue", "blue", "black"),
value = 1:5)
filter(df, color == "blue")
It works perfectly, but I would like to refer to color by string, something like this:
var <- "color"
filter(df, this_probably_should_be_a_function(var) == "blue")
I would be happy to do this by any means, and super happy to make use of easy-to-read dplyr syntax.
Answer 1:
For dplyr versions [0.3 - 0.7) (? - June 2017)
(For more recent dplyr versions, please see other answers to this question)
As of dplyr 0.3, every dplyr function using non-standard evaluation (NSE, see release post and vignette) has a standard evaluation (SE) twin ending in an underscore. These can be used for passing variables. For filter it is filter_. Using filter_, you may pass the logical condition as a string.
filter_(df, "color=='blue'")
# color value
# 1 blue 1
# 2 blue 3
# 3 blue 4
Constructing the string with the logical condition is of course straightforward:
l <- paste(var, "==", "'blue'")
filter_(df, l)
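On current dplyr versions, where filter_() is deprecated, the same "build the condition as a string" idea can be sketched with rlang::parse_expr() and !!. This is only an illustration under the assumption that rlang is installed alongside dplyr. The names cond and res are made up for the example.

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)

var <- "color"

# Build the condition as text, then parse it into an R expression.
# `cond` is an illustrative name, not part of the original answer.
cond <- rlang::parse_expr(paste(var, "==", "'blue'"))

# Inject the parsed expression into filter() with !!
res <- filter(df, !!cond)
res
```

Parsing arbitrary strings into code is fragile when the text comes from users, so the pronoun-based approaches in the later answers are usually safer.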
Answer 2:
In newer versions, we can create the variable as a quosure and then unquote it (UQ or !!) for evaluation:
var <- quo(color)
filter(df, UQ(var) == "blue")
# color value
#1 blue 1
#2 blue 3
#3 blue 4
Due to operator precedence, we may need to wrap !! in parentheses:
filter(df, (!!var) == "blue")
# color value
#1 blue 1
#2 blue 3
#3 blue 4
With newer versions, !! has higher precedence, so the following should work (as @Moody_Mudskipper commented):
filter(df, !! var == "blue")
Older option
We may also use:
filter(df, get(var, envir=as.environment(df))=="blue")
#color value
#1 blue 1
#2 blue 3
#3 blue 4
EDIT: Rearranged the order of solutions
Answer 3:
As of dplyr 0.7, some things have changed again.
library(dplyr)
df <- data.frame(
color = c("blue", "black", "blue", "blue", "black"),
value = 1:5)
filter(df, color == "blue")
# it was already possible to use a variable for the value
val <- 'blue'
filter(df, color == val)
# As of dplyr 0.7, new functions were introduced to simplify the situation
col_name <- quo(color) # captures the current environment
df %>% filter((!!col_name) == val)
# Remember to use enquo within a function
filter_col <- function(df, col_name, val){
col_name <- enquo(col_name) # captures the environment in which the function was called
df %>% filter((!!col_name) == val)
}
filter_col(df, color, 'blue')
More general cases are explained in the dplyr programming vignette.
Answer 4:
This is often asked, but as far as I know there is still no easy support for it. However, with regards to this posting:
eval(substitute(filter(df, var == "blue"),
list(var = as.name(var))))
# color value
# 1 blue 1
# 2 blue 3
# 3 blue 4
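Since the question allows "any means", a plain base R sketch without dplyr may also be useful for comparison. It relies only on [[ ]] column lookup by name. The name res is illustrative.

```r
df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)

var <- "color"

# df[[var]] looks the column up by the name stored in `var`;
# the resulting logical vector selects the matching rows.
res <- df[df[[var]] == "blue", ]
res
```

Unlike filter(), this keeps the original row names (1, 3, 4), and rows where the comparison is NA would come back as all-NA rows.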
Answer 5:
Here is one way to do it using the sym() function in the rlang package:
library(dplyr)
df <- data.frame(
main_color = c("blue", "black", "blue", "blue", "black"),
secondary_color = c("red", "green", "black", "black", "red"),
value = 1:5,
stringsAsFactors=FALSE
)
filter_with_quoted_text <- function(column_string, value) {
col_name <- rlang::sym(column_string)
df1 <- df %>%
filter(UQ(col_name) == UQ(value))
df1
}
filter_with_quoted_text("main_color", "blue")
filter_with_quoted_text("secondary_color", "red")
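The same idea can be sketched with the !! operator in place of UQ(), passing the data frame in explicitly. The helper name filter_by_string and its argument names are hypothetical. Note that the value argument is unquoted too: the data frame has its own column called value, which would otherwise shadow the function argument inside filter().

```r
library(dplyr)

df <- data.frame(
  main_color = c("blue", "black", "blue", "blue", "black"),
  secondary_color = c("red", "green", "black", "black", "red"),
  value = 1:5,
  stringsAsFactors = FALSE
)

# Hypothetical helper: filter `data` on a column given as a string.
filter_by_string <- function(data, column_string, value) {
  # Turn the string into a symbol that can be unquoted
  col_name <- rlang::sym(column_string)
  # !!value injects the argument's value, so the `value` column
  # of the data frame is not picked up by mistake
  data %>% filter((!!col_name) == !!value)
}

res <- filter_by_string(df, "secondary_color", "red")
res
```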
Answer 6:
New with rlang version >= 0.4.0: .data is now recognized as a way to refer to the parent data frame, so reference by string works as follows:
var <- "color"
filter(df, .data[[var]] == "blue")
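Wrapped in a function, the .data[[ ]] lookup might look like the sketch below. The helper name filter_by_name and its arguments col and target are hypothetical. The .env pronoun is used on the assumption that the comparison value should always come from the function's arguments, never from a column of the same name.

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)

# Hypothetical helper: `col` is a column name given as a string,
# `target` is the value to match.
filter_by_name <- function(data, col, target) {
  # .data[[col]] refers to the column named by the string in `col`;
  # .env$target refers to the function argument, not a column.
  data %>% filter(.data[[col]] == .env$target)
}

res <- filter_by_name(df, "color", "blue")
res
```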
If the variable is already a symbol, then {{}} will dereference it properly.
Example 1:
var <- quo(color)
filter(df, {{var}} == "blue")
Or, more realistically:
f <- function(v) {
filter(df, {{v}} == "blue")
}
f(color) # Curly-curly provides automatic NSE support
Source: https://stackoverflow.com/questions/24569154/use-variable-names-in-functions-of-dplyr