R code to get max count of time series data by group

Submitted by 旧城冷巷雨未停 on 2020-03-04 17:53:27

Question


I'd like to summarize time series data by consecutive runs of "Flare", keeping for each run the row where FlareLength reaches its maximum.

Suppose I have a data frame like this:


   Date           Flare       FlareLength
1  2015-12-01     0           1
2  2015-12-02     0           2
3  2015-12-03     0           3
4  2015-12-04     0           4
5  2015-12-05     0           5
6  2015-12-06     0           6
7  2015-12-07     1           1
8  2015-12-08     1           2
9  2015-12-09     1           3
10 2015-12-10     1           4
11 2015-12-11     0           1
12 2015-12-12     0           2
13 2015-12-13     0           3
14 2015-12-14     0           4
15 2015-12-15     0           5
16 2015-12-16     0           6
17 2015-12-17     0           7
18 2015-12-18     0           8
19 2015-12-19     0           9
20 2015-12-20     0          10
21 2015-12-21     0          11
22 2016-01-11     1           1
23 2016-01-12     1           2
24 2016-01-13     1           3
25 2016-01-14     1           4
26 2016-01-15     1           5
27 2016-01-16     1           6
28 2016-01-17     1           7
29 2016-01-18     1           8

I'd like output like this:

  Date           Flare       FlareLength
1 2015-12-06     0           6
2 2015-12-10     1           4
3 2015-12-21     0          11
4 2016-01-18     1           8

I have tried various aggregate forms but I'm not very familiar with the time series wrinkle.


Answer 1:


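Using dplyr, we can create a grouping variable by comparing each FlareLength with the previous value: whenever FlareLength drops, a new run has started, so the cumulative sum of those drops gives each run its own id. We then select the row with the maximum FlareLength within each group.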

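library(dplyr)

df %>%
  # gr increases by 1 each time FlareLength is lower than the previous row
  group_by(gr = cumsum(FlareLength < lag(FlareLength, 
                       default = first(FlareLength)))) %>%
  # keep the row with the largest FlareLength in each run
  slice(which.max(FlareLength)) %>%
  ungroup() %>%
  # drop the helper column
  select(-gr)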

# A tibble: 4 x 3
#  Date       Flare FlareLength
#  <fct>      <int>       <int>
#1 2015-12-06     0           6
#2 2015-12-10     1           4
#3 2015-12-21     0          11
#4 2016-01-18     1           8
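
To make the grouping step easier to follow, here is a minimal sketch on a made-up vector. The names fl, resets, gr and run_max are hypothetical, and the vector is a toy input, not the question's data. It uses diff() in place of lag(), which should produce the same run ids for this purpose.

```r
# Toy counter that restarts twice (hypothetical data, for illustration only)
fl <- c(1, 2, 3, 1, 2, 1, 2, 3, 4)

# TRUE wherever the counter is lower than the previous value, i.e. a new run
# starts; the first element has no predecessor, so it is FALSE
resets <- c(FALSE, diff(fl) < 0)

# The cumulative sum turns the reset flags into a run id
gr <- cumsum(resets)
gr
# 0 0 0 1 1 2 2 2 2

# Maximum of the counter within each run
run_max <- tapply(fl, gr, max)
```

Note that the comparison is strict, so a run of length 1 followed directly by another run of length 1 (values 1, 1) would not be detected as a reset.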

The same result can be obtained in base R with ave(), which returns the per-group maximum aligned to every row:

subset(df, FlareLength == ave(FlareLength, cumsum(c(TRUE, diff(FlareLength) < 0)), 
           FUN = max))
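
For anyone who wants to try this without the original data, the following is a sketch that rebuilds the data frame from the question's table and, as an alternative not taken from the answer, derives the run id from changes in Flare with rle() instead of from drops in FlareLength. The names runs, run_id and res are hypothetical, and Date is built as a Date column here, whereas the answer's output shows it as a factor.

```r
# Rebuild the example data: four runs of lengths 6, 4, 11 and 8
df <- data.frame(
  Date = c(seq(as.Date("2015-12-01"), by = "day", length.out = 21),
           seq(as.Date("2016-01-11"), by = "day", length.out = 8)),
  Flare = rep(c(0L, 1L, 0L, 1L), times = c(6, 4, 11, 8)),
  FlareLength = sequence(c(6, 4, 11, 8))
)

# Assumed alternative: one id per consecutive run of identical Flare values
runs <- rle(df$Flare)
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# Keep the rows whose FlareLength equals the maximum of their run
res <- df[df$FlareLength == ave(df$FlareLength, run_id, FUN = max), ]
res
```

On this data both grouping rules should select the same four rows. Grouping on Flare itself does not depend on FlareLength strictly decreasing at a run boundary, which may matter if very short runs can occur.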


Source: https://stackoverflow.com/questions/59960252/r-code-to-get-max-count-of-time-series-data-by-group
