Replace missing values (NA) with most recent non-NA by group

南旧 2020-11-22 05:42

I would like to solve the following problem with dplyr, preferably with one of the window functions. I have a data frame with houses and buying prices. The following is an e

7 answers
  •  星月不相逢
    2020-11-22 06:06

    tidyr::fill now makes this stupidly easy:

    library(dplyr)
    library(tidyr)
    # or library(tidyverse)
    
    df %>% group_by(houseID) %>% fill(price)
    # Source: local data frame [15 x 3]
    # Groups: houseID [3]
    # 
    #    houseID  year price
    #      (int) (int) (int)
    # 1        1  1995    NA
    # 2        1  1996   100
    # 3        1  1997   100
    # 4        1  1998   120
    # 5        1  1999   120
    # 6        2  1995    NA
    # 7        2  1996    NA
    # 8        2  1997    NA
    # 9        2  1998    30
    # 10       2  1999    30
    # 11       3  1995    NA
    # 12       3  1996    44
    # 13       3  1997    44
    # 14       3  1998    44
    # 15       3  1999    44
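
For anyone who wants to run this end to end, here is a minimal sketch. The input `df` below is an assumption: the question's data is cut off above, so it is reconstructed to be consistent with the printed output. The names `filled` and `filled_both` are only illustrative. `fill()` fills downward by default, which is why NAs at the start of a group stay NA; the `.direction = "downup"` variant needs tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Assumed input, reconstructed from the output shown above
# (the original question's data is truncated).
df <- data.frame(
  houseID = rep(1:3, each = 5),
  year    = rep(1995:1999, times = 3),
  price   = c(NA, 100, NA, 120, NA,
              NA, NA, NA, 30, NA,
              NA, 44, NA, NA, NA)
)

# Carry the last non-NA price forward within each house.
# Leading NAs remain because there is no earlier value to carry.
filled <- df %>%
  group_by(houseID) %>%
  fill(price) %>%
  ungroup()

# Optional variant: fill down first, then fill any remaining
# leading NAs upward from the first observed price in the group.
filled_both <- df %>%
  group_by(houseID) %>%
  fill(price, .direction = "downup") %>%
  ungroup()

print(filled)
print(filled_both)
```

Calling `ungroup()` at the end is a design choice, not a requirement: it just keeps later operations from silently running per house.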
    
