I am literally stuck on this. df1 has the following variables:
serial = group of people
id1 = th
You can make use of lead and lag from dplyr. I tried it on my side, and here is the result:
library(dplyr)

df %>%
  # keep the grouping column and the columns whose names contain "day"
  select(serial, contains("day", ignore.case = FALSE)) %>%
  group_by(serial) %>%
  # reshape to long format: one row per serial and day
  tidyr::gather(day, val, -serial) %>%
  # convert to binary
  mutate(occur = ifelse(val > 0, 1, 0)) %>%
  # if detect a seq, add cumulative, else 0
  mutate(cums = ifelse(lead(occur) > 0 & lag(occur) > 0 & occur > 0,
                       occur + cumsum(occur), 0)) %>%
  summarise(occurance = max(cums, na.rm = TRUE), duration = sum(val))
# A tibble: 3 x 3
  serial occurance duration
1     10         6       18
2     12         7       11
3    123         0       12
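
To see what lead and lag do on their own, here is a minimal sketch on a made-up vector (the values in occur are hypothetical and are not taken from the data above). It shows the shifted vectors and the same three-way condition used in the mutate step, which is only TRUE where the previous, current, and next values are all positive.

```r
library(dplyr)

# Hypothetical toy vector: 1 = something happened that day, 0 = nothing
occur <- c(0, 1, 1, 1, 0, 1)

# lag() shifts values one step forward in position, so each slot holds the previous value
prev_val <- lag(occur)

# lead() shifts values one step back in position, so each slot holds the next value
next_val <- lead(occur)

# Same condition as in the pipeline: previous, current, and next all positive
inside_run <- next_val > 0 & prev_val > 0 & occur > 0

print(prev_val)
print(next_val)
print(inside_run)
```

Note that the first and last positions of a run are not flagged, because one of their neighbours is 0 or NA. That is worth keeping in mind when reading the cums column.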