How to filter (with dplyr) for all values of a group if variable limit is reached?

Posted by 牧云@^-^@ on 2019-11-28 05:37:54

Question


Here's the dummy data:

library(dplyr)

cases <- rep(1:5,times=2)
var1 <- as.numeric(c(450,100,250,999,200,500,980,10,700,1000))
var2 <- as.numeric(c(111,222,333,444,424,634,915,12,105,152))

maindata1 <- data.frame(cases,var1,var2)

df1 <-  maindata1 %>%
  filter(var1 >950) %>%
  distinct(cases) %>%
  select(cases)

table1 <- maindata1 %>%
  filter(cases == 2 | cases == 4 | cases == 5) %>%
  arrange(cases)

> table1
  cases var1 var2
1     2  100  222
2     2  980  915
3     4  999  444
4     4  700  105
5     5  200  424
6     5 1000  152

I'm trying to build a data frame that contains all the data for every case where var1 exceeds 950 at least once. It should show every value of var1 for those cases (including the values at or below 950) along with all values of var2, and it should drop every case where var1 never exceeds 950. table1 is the desired data frame, but I had to enter the filtering conditions manually. Is there a way to use df1$cases as the filtering condition to get the same data frame?

I'm new to R and trying to learn data manipulation mainly with dplyr, because its syntax is almost understandable to a layman. A dplyr-based solution would be fantastic, but I'm happy to hear solutions based on other packages as well.


Answer 1:


Filter by max(var1) in each group defined by cases:

maindata1 %>%
  group_by(cases) %>%
  filter(max(var1) > 950) %>%
  arrange(cases)

#   cases var1 var2
# 1     2  100  222
# 2     2  980  915
# 3     4  999  444
# 4     4  700  105
# 5     5  200  424
# 6     5 1000  152
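
The question also asks whether df1$cases itself can serve as the filter. A minimal sketch of that, assuming the same dummy data as above: `%in%` tests each row's case id against the ids collected in df1, and `semi_join()` does the same thing as a join. The names table_in, table_semi and table_any are made up for this illustration and are not from the original post. The last variant uses any() instead of max(), which should give the same rows as the grouped filter above.

```r
library(dplyr)

# Dummy data from the question
cases <- rep(1:5, times = 2)
var1 <- as.numeric(c(450, 100, 250, 999, 200, 500, 980, 10, 700, 1000))
var2 <- as.numeric(c(111, 222, 333, 444, 424, 634, 915, 12, 105, 152))
maindata1 <- data.frame(cases, var1, var2)

# Case ids where var1 exceeds 950 at least once (df1 from the question)
df1 <- maindata1 %>%
  filter(var1 > 950) %>%
  distinct(cases)

# Option 1: keep every row whose case id appears in df1$cases
table_in <- maindata1 %>%
  filter(cases %in% df1$cases) %>%
  arrange(cases)

# Option 2: the same selection written as a filtering join
table_semi <- maindata1 %>%
  semi_join(df1, by = "cases") %>%
  arrange(cases)

# Option 3: no helper table; any() keeps a group if any row passes the limit
table_any <- maindata1 %>%
  group_by(cases) %>%
  filter(any(var1 > 950)) %>%
  ungroup() %>%
  arrange(cases)
```

The `%in%` form is closest to what the question describes. The grouped filter avoids the intermediate df1 altogether, which is probably simpler when the list of ids is not needed elsewhere.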


Source: https://stackoverflow.com/questions/29630045/how-to-filter-with-dplyr-for-all-values-of-a-group-if-variable-limit-is-reache
