Question
In R's aggregate() function, how can I specify a stopping condition for the grouping, based on the function applied to the variable?
For example, I have a data frame "df" like this: [image: input data frame]
Note: each row in the input data frame represents a single ball played by a player in that match, so counting rows gives the number of balls required.
I want the output data frame to look like this: [image: output data frame]. My question is: how many balls are required to score 10 runs?
Currently, I am using this R code:
group_data <- aggregate(df$score, by=list(Category=df$player,df$match), FUN=sum,na.rm = TRUE)
With this code I cannot stop the grouping where I want; it only stops after it has aggregated all rows of each group. I don't want all rows to be considered.
How can I add a constraint like "stop grouping as soon as score >= 10"? My sole purpose in adding this constraint is to count the number of rows needed to satisfy the condition.
Thanks in advance.
Answer 1:
Here is one option using dplyr
library(dplyr)
df1 %>%
group_by(match, player) %>%
filter(!lag(cumsum(score) >= 10, default = FALSE)) %>%
summarise(score = sum(score), Count = n())
# A tibble: 2 x 4
# Groups: match [?]
# match player score Count
# <int> <int> <dbl> <int>
#1 1 30 12 2
#2 2 31 15 3
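To see why the filter keeps the right rows, the logic can be traced for a single group in base R. This is only a sketch: it uses the ">= 10" threshold from the question, the scores of match 1 from the example data, and variable names (running, reached, reached_before, keep) that are made up for illustration.

```r
# Scores for one player in one match, one element per ball
# (values taken from match 1 in the example data).
score <- c(6, 6, 6)

# Running total after each ball.
running <- cumsum(score)

# TRUE from the ball on which the target of 10 runs is reached.
reached <- running >= 10

# Shift by one ball: was the target already reached BEFORE this ball?
# This mirrors lag(..., default = FALSE) in the dplyr pipeline.
reached_before <- c(FALSE, head(reached, -1))

# Keep every ball up to and including the one that reaches the target.
keep <- !reached_before

balls_needed <- sum(keep)         # number of balls kept
runs_scored  <- sum(score[keep])  # runs scored on those balls
```

The one-ball shift is what keeps the ball that crosses the threshold; filtering on !reached directly would drop it.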
Data:
df1 <- structure(list(match = c(1L, 1L, 1L, 2L, 2L, 2L), player = c(30L,
30L, 30L, 31L, 31L, 31L), score = c(6, 6, 6, 3, 6, 6)), .Names = c("match",
"player", "score"), row.names = c(NA, -6L), class = "data.frame")
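Since the question asks about aggregate(), here is a possible base R sketch of the same idea. It assumes the rows within each match/player group are already in the order the balls were played. The helper balls_to_target() and the variable target are hypothetical names introduced for this example, not part of any package.

```r
# Same example data as above, rebuilt with data.frame() so the block is self-contained.
df1 <- data.frame(
  match  = c(1L, 1L, 1L, 2L, 2L, 2L),
  player = c(30L, 30L, 30L, 31L, 31L, 31L),
  score  = c(6, 6, 6, 3, 6, 6)
)

target <- 10  # runs to reach (hypothetical variable name)

# Hypothetical helper: index of the first ball at which the running
# total reaches the target, or NA if the target is never reached.
balls_to_target <- function(score, target) {
  hit <- which(cumsum(score) >= target)
  if (length(hit) == 0) NA_integer_ else hit[1]
}

# Apply the helper once per match/player group.
result <- aggregate(score ~ match + player, data = df1,
                    FUN = function(s) balls_to_target(s, target))

# The aggregated column holds ball counts, not runs, so rename it.
names(result)[names(result) == "score"] <- "balls"
print(result)
```

One design difference from the dplyr pipeline: if a group never reaches the target, this sketch returns NA for it, whereas the filter-based version would count all of the group's balls.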
Source: https://stackoverflow.com/questions/47111544/dynamic-grouping-in-r-grouping-based-on-condition-on-applied-function