Group Data in R for consecutive rows


故里飘歌 2020-12-10 18:13

If there's not a quick 1-3 liner for this in R, I'll definitely just use Linux sort and a short Python program using groupby, so don't bend over

3 answers

庸人自扰 (original poster)

2020-12-10 18:42
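In dplyr, I would do this by creating another grouping variable for the consecutive rows. This is what cumsum(c(1, diff(weight) != 0)) does in the code chunk below: diff(weight) != 0 is TRUE wherever the weight differs from the previous row, so the cumulative sum goes up by one at the start of each new run. An example of this is also here.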
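To see what that expression produces on its own, here is a minimal base R sketch. The weight vector is made up for illustration (the thread does not show the input data), and the rle() line is only a cross-check, not part of the answer's approach.

```r
# Hypothetical weights; the thread does not show the real input.
weight <- c(150, 150, 151, 150, 150, 170, 170, 171)

# TRUE wherever the value differs from the previous one; the leading 1
# starts the first run, and cumsum() turns the flags into a run counter.
run_id <- cumsum(c(1, diff(weight) != 0))
print(run_id)
# 1 1 2 3 3 4 4 5

# Cross-check with run-length encoding from base R.
rle_id <- with(rle(weight), rep(seq_along(lengths), lengths))
print(all(run_id == rle_id))
# TRUE
```

Note that the second run of 150 gets its own id (3) instead of being merged with the first, which is the point of grouping by consecutive rows and not by value alone.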

The group creation can be done within group_by, and then you can proceed accordingly with making any summaries by group.
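```
library(dplyr)

# group_weight is a run counter: it goes up by one each time weight
# differs from the previous row, so consecutive equal weights share a value.
df_in %>%
    group_by(ID, group_weight = cumsum(c(1, diff(weight) != 0)), weight) %>%
    # Collapse each run to its earliest start_day and latest end_day.
    summarise(start_day = min(start_day), end_day = max(end_day))

#> Source: local data frame [5 x 5]
#> Groups: ID, group_weight [?]
#>
#>      ID group_weight weight start_day end_day
#>   (dbl)        (dbl)  (dbl)     (dbl)   (dbl)
#> 1     1            1    150         1       7
#> 2     1            2    151         7      10
#> 3     1            3    150        10      30
#> 4     2            4    170         5      20
#> 5     2            5    171        20      30
```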
This approach does leave you with the extra grouping variable in the dataset, which can be removed, if needed, with select(-group_weight) after ungrouping.
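For completeness, a sketch of the whole pipeline including that cleanup step. The thread never shows df_in, so the data frame below is hypothetical, chosen only so that the summary matches the output printed above; the column names are taken from the answer's code.

```r
library(dplyr)

# Hypothetical input; the real df_in is not shown in the thread.
df_in <- data.frame(
    ID        = c(1, 1, 1, 1, 1, 2, 2, 2),
    weight    = c(150, 150, 151, 150, 150, 170, 170, 171),
    start_day = c(1, 4, 7, 10, 20, 5, 10, 20),
    end_day   = c(4, 7, 10, 20, 30, 10, 20, 30)
)

result <- df_in %>%
    group_by(ID, group_weight = cumsum(c(1, diff(weight) != 0)), weight) %>%
    summarise(start_day = min(start_day), end_day = max(end_day)) %>%
    # Drop the remaining grouping first, otherwise select() keeps group_weight.
    ungroup() %>%
    select(-group_weight)

print(as.data.frame(result))
```

The ungroup() call matters here: select() always retains grouping columns, so removing group_weight while the result is still grouped by it would not work.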