Question
I have this dataframe that I'd like to subset (if possible, with dplyr or base R functions):
df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))
x y
1 30
1 10
1 8
2 10
2 18
2 5
Assuming x is a factor (so 2 conditions/levels), how can I subset/filter this dataframe so that I get only the df$y values that are greater than 15 where df$x == 1, and the df$y values that are greater than 5 where df$x == 2?
This is what I'd like to get:
df2 <- data.frame(x = c(1,2,2), y = c(30,10,18))
x y
1 30
2 10
2 18
Appreciate any help! Thanks!
Answer 1:
You can try this:
with(df, df[ (x==1 & y>15) | (x==2 & y>5), ])
x y
1 1 30
4 2 10
5 2 18
Or with dplyr:
library(dplyr)
filter(df, (x==1 & y>15) | (x==2 & y>5))
Answer 2:
If you have several 'x' groups, one option would be to use mapply. We split the 'y' using 'x' as grouping variable, create the vector of values to compare against (c(15,5)) and use mapply to get the logical index for subsetting the 'df'.
df[unlist(mapply('>', split(df$y, df$x), c(15,5))),]
# x y
#1 1 30
#4 2 10
#5 2 18
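Note that unlist() returns the comparisons group by group, so the mapply index above lines up with the rows of 'df' only when 'df' is already ordered by 'x'. As a minimal sketch of a variant that does not depend on row order, one could keep the cutoffs in a named vector and look up each row's cutoff by its 'x' value. The names thresholds and keep are illustrative and not part of the original answers; base R only.

```r
df <- data.frame(x = c(1, 1, 1, 2, 2, 2), y = c(30, 10, 8, 10, 18, 5))

# Hypothetical lookup vector: one cutoff per level of x, named by that level
thresholds <- c("1" = 15, "2" = 5)

# Find each row's cutoff by name, then compare row by row
keep <- df$y > unname(thresholds[as.character(df$x)])

df2 <- df[keep, ]
df2
```

as.character() is used so that the lookup is by name rather than by position, which should behave the same whether 'x' is numeric or a factor.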
Source: https://stackoverflow.com/questions/30037199/how-to-filter-dataframe-with-multiple-conditions