Replace missing values (NA) with most recent non-NA by group

南旧 2020-11-22 05:42

I would like to solve the following problem with dplyr, preferably with one of the window functions. I have a data frame with houses and buying prices. The following is an e

7 Answers
  •  温柔的废话
    2020-11-22 06:17

    You can do a rolling self-join, supported by data.table:

    library(data.table)
    setDT(df)                      ## convert df to a data.table in place
    setkey(df, houseID, year)      ## sort by houseID, then year; needed for the keyed join
    df.woNA <- df[!is.na(price)]   ## copy without the NA rows
    
    # Rolling self-join: each row of df is looked up in df.woNA by houseID and year.
    # If the exact year is not found, the most recent earlier year of that house is used.
    df.woNA[df, roll = TRUE]
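
    The rolling self-join above can be tried on a small table. This is only a sketch: the values below are hypothetical, made up for illustration, and are not the data from the question. Only the column names houseID, year and price are taken from the code above.

    ```r
    library(data.table)

    # Hypothetical example data (NOT the data from the question).
    df <- data.frame(
      houseID = c(1L, 1L, 1L, 2L, 2L, 2L),
      year    = c(1995L, 1996L, 1997L, 1995L, 1996L, 1997L),
      price   = c(NA, 100, NA, 200, NA, NA)
    )

    setDT(df)                      # convert to data.table in place
    setkey(df, houseID, year)      # roll applies to the last key column (year)
    df.woNA <- df[!is.na(price)]   # only rows with a known price

    # For every row of df, take the price of the same house in the same year,
    # or, failing that, in the most recent earlier year.
    filled <- df.woNA[df, roll = TRUE]

    # The join keeps the original (possibly NA) price as i.price; drop it.
    filled[, i.price := NULL]
    print(filled)
    ```

    With roll = TRUE values are only carried forward, so a house whose first year has no price (house 1 in 1995 here) keeps NA in that row. The roll also never crosses from one houseID to another, because it acts only on the last key column.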
    
