If there\'s not a quick 1-3 liner for this in R, I\'ll definitely just use linux sort and a short python program using groupby, so don\'t bend over
In dplyr, I would do this by creating another grouping variable for the consecutive rows. This is what the code cumsum(c(1, diff(weight) != 0) is doing in the code chunk below. An example of this is also here.
The group creation can be done within group_by, and then you can proceed accordingly with making any summaries by group.
library(dplyr)
df_in %>%
group_by(ID, group_weight = cumsum(c(1, diff(weight) != 0)), weight) %>%
summarise(start_day = min(start_day), end_day = max(end_day))
Source: local data frame [5 x 5]
Groups: ID, group_weight [?]
ID group_weight weight start_day end_day
(dbl) (dbl) (dbl) (dbl) (dbl)
1 1 1 150 1 7
2 1 2 151 7 10
3 1 3 150 10 30
4 2 4 170 5 20
5 2 5 171 20 30
This approach does leave you with the extra grouping variable in the dataset, which can be removed, if needed, with select(-group_weight) after ungrouping.