Question
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B","D")
I can filter/subset df using the values in vals with:
dplyr::filter(df, a %in% vals)
subset(df, a %in% vals)
Both give:
a b
2 B 0.4481627
4 D 0.2916513
What if I have a variable name in a vector, e.g.:
> names(df)[1]
[1] "a"
Then it doesn't work, I guess because it's quoted:
dplyr::filter(df, names(df)[1] %in% vals)
[1] a b
<0 rows> (or 0-length row.names)
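The likely reason is that names(df)[1] evaluates to the string "a" rather than to the column a, so the condition becomes "a" %in% vals, a single FALSE that is recycled over every row. A minimal base R sketch of the difference, reusing df and vals from above (set.seed is an addition here, only to make b reproducible):

```r
set.seed(1)  # assumption: added only for reproducibility
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B", "D")

# names(df)[1] is just the character string "a" ...
col <- names(df)[1]
string_test <- col %in% vals        # FALSE: the string "a" is not in vals

# ... whereas the column it names holds the values we want to test
column_test <- df[[col]] %in% vals  # FALSE TRUE FALSE TRUE
```

So the fix is to turn the string back into a reference to the column, not to compare the string itself.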
How do you do this?
UPDATE (what if it's dplyr::tbl_df(df)?)
Answers below work fine for data.frames, but not for dplyr::tbl_df wrapped data:
df<-dplyr::tbl_df(df)
dplyr::filter(df, df[,names(df)[1]] %in% vals)
This does not work (I thought tbl_df was a simple wrapper on top of df?).
This does work again:
dplyr::filter(df, as.data.frame(df)[,names(df)[1]] %in% vals)
FINAL UPDATE: It works with tbl_df() using lazyeval::interp
See AndreyAkinshin's solution below.
Answer 1:
You can use df[,"a"] or df[,1]:
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B","D")
dplyr::filter(df, df[,1] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
subset(df, df[,1] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
dplyr::filter(df, df[,"a"] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
subset(df, df[,"a"] %in% vals)
# a b
# 2 B 0.4481627
# 4 D 0.2916513
Working with dplyr::tbl_df(df)
Some magic with lazyeval::interp helps us!
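library(dplyr)  # provides %>% and filter_()
df <- dplyr::tbl_df(df)
# Build the expression `a %in% vals` from the column name held as a string
expr <- lazyeval::interp(quote(x %in% y), x = as.name(names(df)[1]), y = vals)
# Note: filter_() is deprecated as of dplyr 0.7.0
df %>% filter_(expr)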
# Source: local data frame [2 x 2]
#
# a b
# 1 B 0.4481627
# 2 D 0.2916513
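If lazyeval feels heavy, a simpler sketch that should also work on tbl_df data is to extract the column with [[ instead of [. On a tibble, [ with a single column returns another one-column tibble rather than a vector, which is presumably why df[, names(df)[1]] %in% vals fails there. [[ returns the column vector itself. The use of tibble::as_tibble in place of tbl_df, and of set.seed, are assumptions about a current installation, not part of the original answer:

```r
library(dplyr)

set.seed(1)  # assumption: added only for reproducibility
df <- tibble::as_tibble(data.frame(a = LETTERS[1:4], b = rnorm(4)))
vals <- c("B", "D")

col <- names(df)[1]                     # "a"
# df[[col]] is a plain vector, so %in% yields one logical per row
res <- filter(df, df[[col]] %in% vals)
res$a                                   # "B" "D"
```

One caveat: this refers to df by name inside filter(), so it would misbehave if the data had a column that was itself called df or col.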
Answer 2:
A simple way to solve this problem in the tidyverse:
library(tidyverse)
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B","D")
df %>% filter(!!sym(names(.)[1]) %in% vals)
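Here sym() turns the string into a symbol and !! injects it into the filter expression before it is evaluated. An alternative sketch, assuming dplyr 0.7.0 or later, uses the .data pronoun, which accepts the column name as a string directly (col and res are illustrative names, and set.seed is added only for reproducibility):

```r
library(dplyr)

set.seed(1)  # assumption: added only for reproducibility
df <- data.frame(a = LETTERS[1:4], b = rnorm(4))
vals <- c("B", "D")

col <- names(df)[1]                           # "a"
# .data[[col]] looks up the column named by the string in col
res <- df %>% filter(.data[[col]] %in% vals)
res$a                                         # "B" "D"
```

Both forms keep the lookup inside the data, so they behave the same on data.frames and tibbles.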
Source: https://stackoverflow.com/questions/31358953/in-r-subset-or-dplyrfilter-with-variable-from-vector