Question
I'd like to summarize time series data where the grouping variable is "Flare" and the value of interest for each group is the maximum FlareLength.
Suppose I have a data frame like this:
         Date Flare FlareLength
1  2015-12-01     0           1
2  2015-12-02     0           2
3  2015-12-03     0           3
4  2015-12-04     0           4
5  2015-12-05     0           5
6  2015-12-06     0           6
7  2015-12-07     1           1
8  2015-12-08     1           2
9  2015-12-09     1           3
10 2015-12-10     1           4
11 2015-12-11     0           1
12 2015-12-12     0           2
13 2015-12-13     0           3
14 2015-12-14     0           4
15 2015-12-15     0           5
16 2015-12-16     0           6
17 2015-12-17     0           7
18 2015-12-18     0           8
19 2015-12-19     0           9
20 2015-12-20     0          10
21 2015-12-21     0          11
22 2016-01-11     1           1
23 2016-01-12     1           2
24 2016-01-13     1           3
25 2016-01-14     1           4
26 2016-01-15     1           5
27 2016-01-16     1           6
28 2016-01-17     1           7
29 2016-01-18     1           8
I'd like output like:
        Date Flare FlareLength
1 2015-12-06     0           6
2 2015-12-10     1           4
3 2015-12-21     0          11
4 2016-01-18     1           8
I have tried various forms of aggregate, but I'm not sure how to handle the time series aspect.
Answer 1:
Using dplyr, we can create a grouping variable by comparing each FlareLength with the previous FlareLength value (a new group starts whenever the value drops), and then select the row with the maximum FlareLength in each group.
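library(dplyr)

df %>%
  # start a new group each time FlareLength is smaller than on the previous row
  group_by(gr = cumsum(FlareLength < lag(FlareLength,
                                         default = first(FlareLength)))) %>%
  # keep the row with the largest FlareLength in each group
  slice(which.max(FlareLength)) %>%
  ungroup() %>%
  # drop the helper grouping column
  select(-gr)

# A tibble: 4 x 3
#  Date       Flare FlareLength
#  <fct>      <int>       <int>
#1 2015-12-06     0           6
#2 2015-12-10     1           4
#3 2015-12-21     0          11
#4 2016-01-18     1           8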
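To see what the grouping variable looks like, here is a minimal base R sketch that rebuilds the sample data and computes the same run counter without dplyr. The way df is constructed is an assumption (Date is stored as a Date here, whereas the output above shows a factor), and the names gr and run_sizes are only illustrative.

```r
# Rebuild the sample data from the question (column types are an assumption)
df <- data.frame(
  Date = c(seq(as.Date("2015-12-01"), as.Date("2015-12-21"), by = "day"),
           seq(as.Date("2016-01-11"), as.Date("2016-01-18"), by = "day")),
  Flare = rep(c(0L, 1L, 0L, 1L), times = c(6, 4, 11, 8)),
  FlareLength = c(1:6, 1:4, 1:11, 1:8)
)

# TRUE wherever FlareLength is smaller than on the previous row;
# the first row has no predecessor, so it is never a drop
dropped <- c(FALSE, diff(df$FlareLength) < 0)

# The cumulative sum turns each drop into a new group id: 0, 1, 2, 3
gr <- cumsum(dropped)

# Number of rows in each group
run_sizes <- as.integer(table(gr))
print(run_sizes)
```

Each group id covers one uninterrupted count-up of FlareLength, which is why taking the maximum within a group picks the last day of each run.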
In base R, we can do the same with ave:
subset(df, FlareLength == ave(FlareLength,
                              cumsum(c(TRUE, diff(FlareLength) < 0)),
                              FUN = max))
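Both approaches above detect a new run from a drop in FlareLength, so a run of length 1 followed directly by another run (FlareLength 1, then 1 again) would not be split. As a possible variant, not part of the original answer, the run id can instead be derived from changes in Flare itself, which is the grouping the question describes. This is a sketch that assumes Flare always changes value between consecutive runs; df is rebuilt here with assumed column types, and run_id and result are illustrative names.

```r
# Rebuild the sample data from the question (column types are an assumption)
df <- data.frame(
  Date = c(seq(as.Date("2015-12-01"), as.Date("2015-12-21"), by = "day"),
           seq(as.Date("2016-01-11"), as.Date("2016-01-18"), by = "day")),
  Flare = rep(c(0L, 1L, 0L, 1L), times = c(6, 4, 11, 8)),
  FlareLength = c(1:6, 1:4, 1:11, 1:8)
)

# Run id: increases by one each time Flare differs from the previous row
run_id <- cumsum(c(TRUE, diff(df$Flare) != 0))

# Keep the rows whose FlareLength equals the maximum within their run
result <- df[df$FlareLength == ave(df$FlareLength, run_id, FUN = max), ]
print(result)
```

On the sample data this selects the same four rows as the answers above, since every change in Flare coincides with FlareLength restarting at 1.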
Source: https://stackoverflow.com/questions/59960252/r-code-to-get-max-count-of-time-series-data-by-group