Count distinct by group - moving window

没有蜡笔的小新 2021-01-15 14:02

Let's say I have a dataset containing visits to a hospital. My goal is to generate a variable that, for each visit, counts the number of unique patients the visitor has seen up to and including that date.

3 answers
  •  春和景丽
    2021-01-15 14:18

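    We can use dplyr. Within each visitor, match(patient, unique(patient)) gives every row the index of its patient in order of first appearance, and cummax() of that index is the number of distinct patients seen so far.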

    library(dplyr)

    # rows are assumed to be in chronological order within each visitor
    df1 %>%
      group_by(visitor) %>%
      mutate(goal = cummax(match(patient, unique(patient))))
      # or with factor:
      # mutate(goal1 = cummax(as.integer(factor(patient, levels = unique(patient)))))
    
    # A tibble: 10 x 4
    # Groups:   visitor [1]
    #   visitor visitdt   patient  goal
    #     <int> <chr>       <int> <int>
    # 1  125469 1/12/2018   15200     1
    # 2  125469 1/19/2018   15200     1
    # 3  125469 2/16/2018   15200     1
    # 4  125469 2/23/2018   52607     2
    # 5  125469 3/9/2018    52607     2
    # 6  125469 3/16/2018   52607     2
    # 7  125469 3/23/2018   15200     2
    # 8  125469 3/29/2018   15200     2
    # 9  125469 3/30/2018   20589     3
    #10  125469 4/6/2018    20589     3
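
    The same running count can also be written without dplyr. The following is only a sketch in base R, not part of the original answer: it flags the first appearance of each patient with duplicated() and accumulates those flags per visitor with ave(). The helper name running_distinct and the column name goal_base are made up for illustration.

    ```r
    # Same visitor/patient values as the sample data in this answer.
    df1 <- data.frame(
      visitor = rep(125469L, 10),
      patient = c(15200L, 15200L, 15200L, 52607L, 52607L, 52607L,
                  15200L, 15200L, 20589L, 20589L)
    )

    # Hypothetical helper: running count of distinct values in a vector.
    # !duplicated(x) is TRUE only the first time a value appears,
    # so its cumulative sum is the number of distinct values seen so far.
    running_distinct <- function(x) cumsum(!duplicated(x))

    # ave() applies the helper within each visitor and keeps the row order.
    df1$goal_base <- ave(df1$patient, df1$visitor, FUN = running_distinct)

    df1$goal_base
    # 1 1 1 2 2 2 2 2 3 3
    ```

    Like the dplyr version, this counts in row order, so it gives the intended result only if the rows are already sorted by visit date within each visitor.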
    

    Data:

    df1 <- structure(list(visitor = c(125469L, 125469L, 125469L, 125469L, 
    125469L, 125469L, 125469L, 125469L, 125469L, 125469L), visitdt = c("1/12/2018", 
    "1/19/2018", "2/16/2018", "2/23/2018", "3/9/2018", "3/16/2018", 
    "3/23/2018", "3/29/2018", "3/30/2018", "4/6/2018"), patient = c(15200L, 
    15200L, 15200L, 52607L, 52607L, 52607L, 15200L, 15200L, 20589L, 
    20589L), goal = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L)),
    class = "data.frame", row.names = c(NA, 
    -10L))
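
    Note that visitdt is stored as month/day/year text, and sorting it as text would not give chronological order ("3/9/2018" sorts after "3/16/2018"). If the rows may arrive unsorted, one option is to parse the dates and order by them before counting. The sketch below assumes that format; the data frame visits and the column visit_date are hypothetical names used only for this example.

    ```r
    library(dplyr)

    # Hypothetical input: three visits from the sample data, deliberately out of order.
    visits <- data.frame(
      visitor = c(125469L, 125469L, 125469L),
      visitdt = c("3/30/2018", "1/12/2018", "2/23/2018"),
      patient = c(20589L, 15200L, 52607L)
    )

    result <- visits %>%
      # parse the month/day/year strings into Date values
      mutate(visit_date = as.Date(visitdt, format = "%m/%d/%Y")) %>%
      # put each visitor's rows in chronological order
      arrange(visitor, visit_date) %>%
      group_by(visitor) %>%
      # running count of distinct patients, as in the answer above
      mutate(goal = cummax(match(patient, unique(patient)))) %>%
      ungroup()

    result$goal
    # 1 2 3
    ```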
    
